STREPTOCOCCUS PNEUMONIAE DNA SEQUENCES

This invention provides DNA sequences from the 
Streptococcus pneumoniae genome, and methods of use of DNA 
fragments originating therefrom in a variety of biological 
and pharmaceutical applications. 

The recent emergence of widespread antibiotic 
resistance in common pathogenic bacterial species has 
justifiably alarmed the medical and research communities. 
Frequently these organisms are co-resistant to several 
different antibacterial agents. Particularly problematic has 
been the emergence and rapid spread of penicillin resistance 
in Streptococcus pneumoniae, which frequently causes upper 
respiratory tract infections. Resistance to penicillin in 
this organism can be due to modifications of one or more of 
the penicillin-binding proteins (PBPs). Combating the
phenomenon of increasing resistance to antibiotic agents
among pathogenic organisms such as Streptococcus pneumoniae
will require intensified research into the fundamental
molecular biology of such organisms. Greater knowledge about
the molecular biology of pathogenic organisms will lead to
new antibacterial agents having novel and effective actions.

While inroads in the development of new antibiotics and 
new targets for antibiotic compounds have been made with a 
variety of microorganisms, progress has been less apparent 
in Streptococcus pneumoniae. In part, Streptococcus 
pneumoniae presents a special case because this organism is 
highly recombinogenic and readily takes up exogenous DNA 
from its surroundings. Thus, the need for new antibacterial 
compounds and new targets for antibacterial therapy in 
Streptococcus pneumoniae is more acute than in other 
organisms.

The present invention relates to the genome of S.
pneumoniae. The genomic information disclosed by the present
invention enables: (1) preparation of molecular
hybridization probes for use in PCR amplification of genes
and regulatory regions, physical mapping, sequencing,
mutagenesis, and mutation analysis, (2) homology comparisons
with the genomes and open reading frames (ORFs) of other
organisms, (3) creation of specifically mutated strains of
S. pneumoniae wherein the mutation is targeted to any site
or sites in the DNA sequence disclosed herein, (4)
identification of S. pneumoniae promoters and other gene
regulatory sequences, (5) identification of proteins/ORFs
encoded by S. pneumoniae, (6) identification of virulence
genes in S. pneumoniae, (7) determination of the biological
function of proteins/ORFs and RNAs encoded by S. pneumoniae,
(8) production of kits useful for determining gene function
in the cell, and kits for isolating and analyzing genes that
are mutated in antibiotic resistant clinical isolates of S.
pneumoniae, (9) production of proteins and RNAs encoded by
S. pneumoniae, (10) production of antibodies against
proteins and other antigens encoded by S. pneumoniae, and
(11) methods to identify compounds that bind to proteins and
RNAs encoded by S. pneumoniae as potential new antibiotic
compounds.

In another embodiment the invention relates to 
substantially purified proteins encoded by the S. pneumoniae 
genome.

Table 1 summarizes the proteins and nucleic acids
disclosed herein, contigs, SEQ ID NOs, and predicted
functions.

"Genome" refers to the full complement of chromosomal
and extra-chromosomal DNA within a cell. The genome
comprises the genetic blueprint for all proteins and RNAs
encoded by the cell or organism.

"ORF" (i.e. "open reading frame") designates a region
of genomic DNA beginning with a Met or other initiation
codon and terminating with a translation stop codon,
potentially encoding a protein product. "Partial ORF" means
a portion of an ORF as disclosed herein such that the
initiation codon, the stop codon, or both are not disclosed.
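
By way of illustration only, the ORF definition given above can be sketched in Python. The function name find_orfs, the restriction to ATG as the sole initiation codon, the forward-strand-only scan, and the min_codons parameter are assumptions made for this sketch and are not part of the disclosure. The sketch reports an ORF as partial only when the stop codon is missing; a missing initiation codon is not handled.

```python
# Minimal ORF scan on one strand; names and defaults are illustrative assumptions.
START_CODONS = {"ATG"}                  # Met; other initiation codons could be added
STOP_CODONS = {"TAA", "TAG", "TGA"}     # translation stop codons


def find_orfs(dna, min_codons=1):
    """Return (start, end, is_partial) tuples for the forward strand.

    start is the 0-based index of the initiation codon. end is the index
    just past the stop codon, or just past the last complete codon when the
    sequence ends before a stop codon is reached (a partial ORF).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        last_codon_end = frame
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            last_codon_end = i + 3
            if start is None and codon in START_CODONS:
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count the coding codons between start and stop.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, False))
                start = None
        # An initiation codon with no stop codon downstream is a partial ORF.
        if start is not None and (last_codon_end - start) // 3 >= min_codons:
            orfs.append((start, last_codon_end, True))
    return orfs
```

A complete scan of a genomic fragment would also apply the same function to the reverse complement of the sequence.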

"DNA chip" or "Bio Chip" or "Bio DNA chip" refers to a 
solid matrix or support onto which is applied an array of 
oligonucleotides, or nucleotide sequences, or gene 
fragments, or genomic fragments, of S. pneumoniae, which may
further comprise a layer of S. pneumoniae cells suspended
thereover in a semisolid medium such as agar or agarose.

"Consensus sequence" refers to an amino acid or 
nucleotide sequence that may suggest the biological function 
of a protein, DNA, or RNA molecule. Consensus sequences are 
identified by comparing proteins, RNAs, and gene homologs
from different species. 

"Contiguous fragment building" or "Contiguous fragment" 
or "Contig" refers to the process and result, respectively, 
by which a fragment of DNA is assembled from smaller
constituent DNA fragments by arranging the constituent
pieces in their correct order and register such that the
resulting contiguous fragment accurately depicts the native
DNA sequence from which the smaller fragments originated.

"Computer readable medium" includes, for example, a
floppy disc, hard disc, random access memory, read only
memory, and CD-ROM.

The terms "cleavage" or "restriction" of DNA 
refer to the catalytic cleavage of the DNA with a
restriction enzyme that acts only at certain sequences in 



WO 98/26072 



PCT/US97/22578 



the DNA (viz. sequence-specific endonucleases). The various
restriction enzymes used herein are commercially available
and their reaction conditions, cofactors, and other
requirements are used in the manner well known to one of
ordinary skill in the art. Appropriate buffers and substrate
amounts for particular restriction enzymes are specified by 
the manufacturer or can be found in the literature. 

"Diagnostics" as used herein relates to in vitro 
or in vivo diagnosis for disease states or biological status
in mammals, preferably humans.

"Therapeutics" and "therapeutic/diagnostic 
combinations" mean the treatment, or diagnosis and
treatment, of disease states or biological status by in vivo
administration to mammals, preferably humans, of
compositions of the present invention, for example,
antibodies.

"Essential genes" or "essential ORFs" or 
"essential proteins" refer to genomic information or the 
protein(s) or RNAs encoded therefrom, which, when disrupted
by knockout mutation, or by other mutation, produce
inviability in cells harboring said mutation.

"Non-essential genes" or "non-essential ORFs" or 
"non-essential proteins" refer to genomic information or the 
protein(s) or RNAs encoded therefrom, which, when disrupted
by knockout mutation, or other mutation, do not result in
inviability of cells harboring said mutation.

"Minimal gene set" refers to a genus of about 256 
genes that are conserved among different bacteria such as M. 
genitalium and H. influenzae. The minimal gene set appears
to be necessary and sufficient to sustain life. See e.g. A.
Mushegian and E. Koonin, "A minimal gene set for cellular
life derived by comparison of complete bacterial genomes"
Proc. Nat. Acad. Sci. 93, 10268-10273 (1996).

a nucleic acid molecule described herein, wherein said
fragment comprises a region of contiguity within said
nucleic acid of at least 15 base pairs. The term may also
refer to a peptide of at least 5 contiguous amino acid
residues of a protein disclosed herein.

The term "plasmid" refers to an extrachromosomal 
genetic element. The starting plasmids herein are either 
commercially available, publicly available on an 
unrestricted basis, or can be constructed from available
plasmids in accordance with published procedures. In
addition, equivalent plasmids to those described are known
in the art and will be apparent to the ordinarily skilled
artisan.

"Recombinant DNA cloning vector" as used herein
refers to any autonomously replicating agent, including, but
not limited to, plasmids and phages, comprising a DNA 
molecule to which one or more additional DNA segments can or 
have been added. 

The term "recombinant DNA expression vector" as
used herein refers to any recombinant DNA cloning vector,
for example a plasmid or phage, in which a promoter and 
other regulatory elements are present to enable 
transcription of the inserted DNA. 

The term "vector" as used herein refers to a
nucleic acid compound used for introducing exogenous DNA
into host cells. A vector comprises a nucleotide sequence
which may encode one or more protein molecules. Plasmids,
cosmids, viruses, and bacteriophages, in the natural state
or which have undergone recombinant engineering, are
examples of commonly used vectors.

The terms "complementary" or "complementarity" as
used herein refer to the capacity of purine and pyrimidine
nucleotides to associate through hydrogen bonding in double
stranded nucleic acid molecules. The following base pairs
are complementary: guanine and cytosine; adenine and
thymine; and adenine and uracil.
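
The base-pairing rules stated above can be expressed, by way of a non-limiting sketch, in a few lines of Python. The names reverse_complement and is_complementary are hypothetical and chosen for illustration only. The sketch assumes unambiguous bases (A, C, G, and T or U) and antiparallel strand orientation.

```python
# Pairing rules from the definition: G with C, A with T (DNA), A with U (RNA).
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(seq, rna=False):
    """Return the strand that would pair with seq, written 5' to 3'."""
    table = RNA_COMPLEMENT if rna else DNA_COMPLEMENT
    return seq.upper().translate(table)[::-1]


def is_complementary(strand_a, strand_b):
    """True if two DNA strands pair base-for-base in antiparallel orientation."""
    return reverse_complement(strand_a) == strand_b.upper()
```

For instance, a probe or primer designed against one strand of a disclosed fragment could be derived by applying reverse_complement to the corresponding stretch of the opposite strand.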

"Oligonucleotide" refers to a short polymeric 
nucleotide chain comprising from about 2 to 25 nucleotides.

"Isolated nucleic acid compound" refers to any RNA 
or DNA sequence, however constructed or synthesized, which 
is locationally distinct from its natural location. 

A "primer" is a nucleic acid fragment which 
functions as an initiating substrate for enzymatic or
synthetic elongation of a nucleic acid molecule. 

The term "promoter" refers to a DNA sequence which 
directs transcription of DNA to RNA. 

A "probe" as used herein is a labeled nucleic acid 
compound which can be used to hybridize with another nucleic
acid compound. 

The term "hybridization" or "hybridize" as used 
herein refers to the process by which a single-stranded 
nucleic acid molecule joins with a complementary strand 
through nucleotide base pairing.

"Recorded" as used herein refers to a process for 
storing information on a computer readable medium. 

"Substantially identical" means a sequence having 
sufficient homology to hybridize under high stringency 
conditions and/or at least 90% identity at the nucleotide or
amino acid sequence level to a sequence disclosed herein. 
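
The 90% identity criterion can be illustrated with the following Python sketch, which is offered as an assumption-laden example rather than a prescribed method. It presumes that the two sequences have already been aligned to equal length by some other tool, that "-" marks a gap, and that positions where both sequences carry a gap are ignored; the function names and the gap convention are hypothetical. The alternative hybridization criterion in the definition is not modeled.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns holding the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def is_substantially_identical(aligned_a, aligned_b, threshold=90.0):
    """Apply the identity threshold from the definition (default 90%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```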

"Substantially purified" when used in reference to 
a protein or peptide means that the molecule has been 
largely, but not necessarily wholly, separated and purified 
from other cellular and non-cellular components. Typically a
protein is substantially pure when it is at least about 60%,
by weight, free from other naturally occurring organic
molecules. Preferably the purity is at least about 75%, more
preferably at least about 90%, and most preferably at least
about 99% by weight pure.

"Selective hybridization" refers to hybridization 
under conditions of high stringency. Hybridization of 
nucleic acid molecules depends upon factors such as the
degree of complementarity, stringency of hybridization 
conditions, and the length of hybridizing strands. 

The term "stringency" relates to nucleic acid 
hybridization conditions. High stringency conditions
disfavor non-homologous base pairing. Low stringency
conditions have the opposite effect. Stringency may be
altered, for example, by changes in temperature and salt
concentration. Typical high stringency conditions comprise
hybridizing at 50°C to 65°C in 5X SSPE and 50% formamide,
and washing at 50°C to 65°C in 0.5X SSPE; typical low
stringency conditions comprise hybridizing at 35°C to 37°C
in 5X SSPE and 40% to 45% formamide and washing at 42°C in
1X-2X SSPE.

"SSPE" denotes a hybridization and wash solution 
comprising sodium chloride, sodium phosphate, and EDTA, at
pH 7.4. A 20X solution of SSPE is made by dissolving 174 g
of NaCl, 27.6 g of NaH2PO4·H2O, and 7.4 g of EDTA in 800 ml
of H2O. The pH is adjusted with NaOH and the volume brought
to 1 liter.

"SSC" denotes a hybridization and wash solution
comprising sodium chloride and sodium citrate at pH 7. A 20X
solution of SSC is made by dissolving 175 g of NaCl and 88 g
of sodium citrate in 800 ml of H2O. The volume is brought to
1 liter after adjusting the pH with 10N NaOH.
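
Working-strength solutions such as 5X or 0.5X SSPE are ordinarily prepared by diluting a 20X stock. The arithmetic can be sketched as follows, using the standard dilution relation (stock strength times stock volume equals working strength times final volume). The function names are hypothetical, and the sketch treats everything other than the stock as a single diluent; in a hybridization solution part of that diluent volume would be formamide or other components.

```python
def stock_volume_ml(stock_strength, working_strength, final_volume_ml):
    """Volume of stock needed, from C1 * V1 = C2 * V2."""
    if not 0 < working_strength <= stock_strength:
        raise ValueError("working strength must be positive and no greater than the stock")
    return final_volume_ml * working_strength / stock_strength


def diluent_volume_ml(stock_strength, working_strength, final_volume_ml):
    """Remaining volume to be made up with water or other components."""
    return final_volume_ml - stock_volume_ml(
        stock_strength, working_strength, final_volume_ml
    )
```

For example, stock_volume_ml(20, 5, 100) gives the volume of 20X stock required for 100 ml of a 5X solution.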

"Virulence gene" as used herein means a gene from
a pathogenic organism such as S. pneumoniae that is required
for infection and/or pathogenicity in vivo. Some virulence
genes are induced during infection of a host; others are
expressed exclusively during in vivo infection.

The Streptococcus pneumoniae genome contains about 2.2 
million nucleotide base pairs and comprises about 2000 to
3000 ORFs and other genes. This invention provides, among
other things, contiguous fragments, genes, and proteins from
the S. pneumoniae genome (SEQ ID NO:1 through SEQ ID
NO:228).

Strain differences in S. pneumoniae may be associated
with nucleotide sequence differences in one or more of the
genomic fragments disclosed herein. Sequences that are 
substantially identical to the sequences disclosed herein 
are intended to be within the scope of the invention. 

The sequence fragments disclosed herein provide a wide
variety of utilities. For example, the fragments may be used
to identify regions of the S. pneumoniae genome that are
expressed as proteins (viz. transcribed into mRNA). The
genomic fragments disclosed herein can also be used to
examine differential expression of S. pneumoniae genes under
diverse environmental conditions, as occurs, for example,
with the expression of virulence genes during in vivo
infection of a host organism. Also contemplated by the
invention are: (1) preparation of molecular hybridization
probes for use in physical mapping, sequencing, mutagenesis,
and mutation analysis, (2) homology comparisons of the
sequences disclosed herein with the genomes and ORFs of other
organisms, (3) creation of specifically mutated strains of
S. pneumoniae wherein the mutation is targeted to any site
in the DNA sequence disclosed herein, (4) identification of
S. pneumoniae promoters and other gene regulatory sequences,
(5) identification of proteins and RNAs encoded by S.
pneumoniae, (6) amplification of S. pneumoniae genes using
the PCR, and (7) production of kits for isolating and
analyzing genes that are mutated in antibiotic resistant
clinical isolates of S. pneumoniae.

Genome Analysis 

In one embodiment, the invention comprises the ORFs and 
fragments thereof encoded by the nucleotide sequences 
disclosed herein. Some of the nucleotide sequences disclosed 
herein encode ORFs and fragments of ORFs (Table 1). The ORFs 
or fragments thereof were identified by translation of the 
nucleic acid sequences disclosed herein. The biological 
function of a protein disclosed in Table 1 was determined by 
homology comparison with known proteins from other 
organisms. A number of computer programs are available to 
assist in homology comparisons, for example Genemark 
(Borodovsky and McIninch, Computers Chem. 17(2), 123, 1993).

Computer-Related Applications 

The nucleotide and/or amino acid sequence information 
of this invention may be provided in a variety of media to 
facilitate use. In one embodiment the present invention 
comprises one or more of the sequences disclosed herein 
recorded on a computer readable medium. A variety of media 
are contemplated, for example, magnetic storage media such 
as floppy discs, hard disc storage, magnetic tape, and
CD-ROM. A skilled artisan can readily adopt any presently
known method for recording information on a computer readable
medium to generate manufactures comprising the nucleotide or
amino acid sequence information of the present invention.
These embodiments are contemplated within the scope of this
invention.

The choice of a data storage structure will generally 
be based on the means chosen to access the stored 
information. A variety of data processor programs and 
formats can be used to store the sequence information of the
invention on computer readable medium. For example, the
sequence can be represented in a word processing text file
that is formatted in commercially available software such as
WordPerfect and Microsoft Word, or it can be represented in
the form of a text only file such as ASCII.
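
As one hedged example of a text only representation, the following Python sketch lays out named sequences in a FASTA-style format (a header line beginning with ">" followed by wrapped sequence lines) and reads them back. The FASTA-style layout, the function names, the 60-character line width, and the record name used in any example are assumptions of this sketch; the disclosure does not prescribe a particular file format.

```python
def format_fasta(records, width=60):
    """Render a {name: sequence} mapping as FASTA-style ASCII text."""
    lines = []
    for name, seq in records.items():
        lines.append(">" + name)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


def parse_fasta(text):
    """Recover the {name: sequence} mapping from FASTA-style text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def save_ascii(text, path):
    """Record the text on a storage medium as a plain ASCII file."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(text)
```

A plain text layout of this kind can be read by most sequence analysis programs without conversion, which is the practical reason for preferring it over a word processor format in this sketch.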

Having S. pneumoniae genomic sequence information in a 
computer readable format enables a skilled artisan to access 
the information for a variety of purposes. For example, 
computer-assisted searching algorithms may be used to
identify open reading frames, and ascertain biological
function based on homology to known proteins from other
organisms. Suitable algorithms for sequence comparisons
include BLAST (Altschul et al., J. Mol. Biol. 215, 403-410,
1990) and BLAZE (Brutlag et al., Comp. Chem. 17, 203-207
(1993)). For identification of ORFs a number of commercially
available software programs are suitable, such as FRAMES
(Genetics Computer Group, Madison, WI).

The genomic information of this invention in
computer-readable form can be manipulated further using
bioinformatics to identify the biological function of
proteins encoded by ORFs as well as the cellular location of
said proteins. The skilled artisan will recognize several
computer-assisted algorithms for this purpose, for example,
PSORT, which is useful for determining the likely location of
a protein within a cell (See K. Nakai & M. Kanehisa, "Expert
system for predicting protein localization sites in
Gram-negative bacteria", Proteins: Structure, Function, and
Genetics, 11, 95-110 (1991)).



Open Reading Frames and Proteins 

The invention also provides proteins encoded by the S. 
pneumoniae genome in substantially purified form (See Table
1). The proteins are classified herein as (1) Hypothetical,
(2) Cell wall biosynthetic, (3) External target, or (4)
Minimal gene set proteins.

Cells that carry knockout mutations in genes encoding
proteins of the hypothetical class are nonviable. Loss of
viability suggests that these proteins may be essential. Two
such proteins, whose genes map to contigs mC14 and m016,
correspond respectively to Haemophilus influenzae ORFs
HI1146 and HI1648. Two other hypothetical proteins, yyaF and
ywbL, correspond to a GTP binding protein and a
transcriptional regulator, respectively.

The proteins of this invention can be used to raise 
antibodies. Antibodies against the hypothetical class of 
proteins are especially attractive. In targeting
presumptively essential cellular functions, antibodies
against "hypothetical proteins" could have therapeutic or
prophylactic applications. Additionally, the "hypothetical"
proteins can be used to screen for agents that bind or 
otherwise interact with said proteins. Such agents could 
lead to the identification of new antibacterial agents. 

Proteins classified in Table 1 as cell wall
biosynthetic proteins, and external target proteins, were
identified by homology with known proteins. These proteins
are useful for identifying agents that bind them and inhibit
bacterial growth. Therefore, in another embodiment of the
invention, the proteins of these classifications are
prepared, preferably by recombinant means as described
herein, substantially purified, and used in a screen to
identify compounds that bind and/or inhibit the activity of
said proteins. A variety of suitable screens are
contemplated for this purpose. For example, the protein(s)
can be labeled by known techniques such as radiolabeling or
fluorescent tagging, or by labeling with biotin/avidin;
thereafter binding of a test compound to a labeled protein
can be determined by any suitable means, well known to the
skilled artisan.

The proteins categorized as "minimal gene set" are 
homologous to a set of highly conserved proteins found in 
other bacteria. The minimal gene set proteins are thought to
be essential for viability, and are useful targets for the 
development of new antibacterial compounds. 

DNA Chips and Applications 

The nucleic acids disclosed herein, or subfragments
thereof, may be arrayed on any suitable solid surface,
thereby constructing a "chip." DNA chip hybridizations
provide greater sensitivity than do conventional
hybridization means, such as Southern hybridization or
Northern hybridization. DNA chips are useful for a variety
of purposes, for example, in mutation and gene expression
analysis, and in probing the structure, function, and
expression of the genome. This aspect of the invention
relates to any one or more of the DNA fragments disclosed
herein, wherein said fragments are attached to a solid
support (i.e. "chip" or "DNA chip" or "Bio chip"). Attachment
of a nucleic acid to a support can be, but is not 
necessarily, accomplished by chemical or enzymatic means. 

In one embodiment, DNA fragments of this invention are
arrayed onto a solid support as a means for assessing gene
expression in S. pneumoniae. The DNA fragments attached to a
chip may be of any size that is suitable for hybridization
to other nucleic acid molecules such as cDNAs, genomic DNAs,
or RNAs. Suitably-sized DNA fragments are from 10 nucleotide
residues to approximately several thousand residues. The
preferred length is about 50 to 500 nucleotides.

Gene expression is assessed using the chips of this
invention by hybridization of a chip to RNA samples, or
cDNA samples, prepared from S. pneumoniae grown
under any suitable conditions. Preferred samples for
hybridization to a chip comprise cDNA. Methods for preparing
RNA or cDNA are well known in the art.

A variety of suitable methods are known for fixing DNA
fragments to solid support matrices [See e.g. D. Stimpson et
al. "Real-time detection of DNA hybridization and melting on
oligonucleotide arrays by using optical wave guides" Proc.
Nat. Acad. Sci. 92, 6379 (1995)]. Preferred surfaces for
producing a chip are glass or polystyrene. Convenient
surfaces are microscope slides, or cover slips (Corning),
treated with silicon or silane to minimize non-specific
binding by DNA or proteins. Also suitable for this purpose
are 96-well microtiter plates.

A light-directed method may be used for attaching
oligonucleotides, enabling nucleotide synthesis directly on
the solid surface using photolabile 5'-protected
N-acyl-deoxynucleotide phosphoramidites and surface linker
chemistry (See Pease et al. "Light-generated oligonucleotide
arrays for rapid DNA sequence analysis" Proc. Nat. Acad.
Sci. 91, 5022-5026, 1994). Alternatively, DNA fragments can
be bound to a surface via interaction with a specific DNA
binding protein. Any suitable DNA binding protein may be
used, for example bacteriophage DNA binding proteins,
Adenovirus binding protein, the E. coli lac-repressor
protein, or λ-repressor protein. DNA binding proteins are
attached to the surface of a chip by covalent chemical
binding, essentially as described in U.S. Patent 5,561,071,
the entire contents of which are incorporated by reference.
The latter method requires that DNA fragments contain a
recognition sequence that enables binding by the DNA binding
protein. Specific sequences for a number of DNA binding
proteins are known. Methods for incorporating specific
binding sequences into the genomic DNA fragments disclosed
herein are well known in the cloning arts.




DNA chip technology enables monitoring S. pneumoniae 
gene expression on a genome-wide level. This feature of the 
invention is particularly attractive for identifying (1) 
genes that are expressed or not expressed during the life 
cycle or infection cycle of S. pneumoniae, and (2) changes
in gene expression that correlate with environmental change. 

For example, virulence genes in S. pneumoniae can be
identified by the DNA chip method disclosed herein.
Identification of virulence genes in S. pneumoniae will
provide new targets for developing novel antibiotics. For
this aspect of the invention any suitable encapsulated
strain of S. pneumoniae is introduced into a mouse, for
example, by intraperitoneal injection, or by introduction
directly into the lungs, or by any other suitable method.
Approximately 2 days after infection a peripheral blood titre
of about 10^8 S. pneumoniae cells/ml is reached. Cells
recovered from peripheral blood, or other suitable tissue,
are used in identifying virulence genes. For this purpose,
cDNAs are prepared from cells recovered from an in vivo
infection and from cells grown in vitro. After labeling, the
cDNAs are hybridized against the DNA chip(s) disclosed
herein. Genomic fragments that hybridize to the in vivo
probe but not to the in vitro probe identify candidate
virulence genes.
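
The comparison of in vivo and in vitro hybridization patterns described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: hybridization is summarized as one numeric signal per chip fragment, a single detection threshold separates "hybridizes" from "does not hybridize", and the function name, the fragment identifiers, and the threshold value are all hypothetical.

```python
def candidate_virulence_fragments(in_vivo, in_vitro, threshold=1.0):
    """Return fragment IDs detected with the in vivo probe only.

    in_vivo and in_vitro map fragment IDs to hybridization signal
    (arbitrary units). A fragment missing from in_vitro is treated as
    giving no signal.
    """
    candidates = []
    for fragment_id, vivo_signal in in_vivo.items():
        vitro_signal = in_vitro.get(fragment_id, 0.0)
        if vivo_signal >= threshold and vitro_signal < threshold:
            candidates.append(fragment_id)
    return sorted(candidates)
```

With signals of 4.0 in vivo and 0.2 in vitro, a fragment would be reported as a candidate, whereas a fragment detected under both conditions would not.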

Also contemplated by this aspect of the invention is a
method for analyzing gene expression in S. pneumoniae cells
grown in or harvested from any desirable in vitro or in vivo
environment, wherein said environment may include compounds
whose effects on gene expression are to be determined.

In another embodiment, the present invention relates to
a DNA bio-chip, useful for correlating DNA sequence with
biological function. The bio-chip comprises an array of the
genomic DNA fragments disclosed herein, or portions thereof,
attached to the surface of any suitable solid support
material. The bio-chip further comprises a layer of
competent S. pneumoniae cells suspended over the DNA array
in any suitable semi-solid medium such as agar or agarose.
The cells suspended on the bio-chip comprise known or
unknown mutant strains, or they may be wild-type cells. The
cell layer is in contact with the DNA matrix such that DNA
on the chip can be taken up by the cells.

The bio-chip is useful for several purposes. For
example, the bio-chip can be used to localize an unknown
mutation to a specific region of the genome by
complementation. The bio-chip enables correlating a
phenotype with a genetic locus. For example, mutant cells
harboring one or more mutations and having at least one
screenable or selectable phenotype can be applied to a
bio-chip and subjected to an environment that allows for
selection, or for screening by complementation. If said
phenotype is the result of a chromosomal mutation or
mutations that map to a genomic fragment present on the
chip, DNA uptake by the cells and repair of the mutation by
recombination will be identifiable by a suitable screen or
selection.

In a preferred embodiment, the bio-chip is overlaid
with competent S. pneumoniae cells. Methods for preparing
competent cells are known (See e.g. LeBlanc et al., Plasmid
28, 130-145, 1992; Pozzi et al., J. Bacteriol. 178, 6087-6090,
1996).

Other embodiments of this aspect of the invention are
contemplated. For example the genomic fragments disclosed
herein could be prepared and dispensed into individual wells
of a 96-well microtitre plate. Competent S. pneumoniae
cells could then be added to the wells under conditions
suitable for DNA uptake followed by plating onto any
suitable selection or screening medium, for example an agar
plate containing suitable growth and/or selection/screening
components.
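
Where fragments are dispensed into a 96-well plate as described above, a record of which fragment occupies which well allows a positive well to be traced back to a genomic region. The following Python sketch shows one way such a record might be kept. The 8-row by 12-column labeling (A1 through H12), the row-by-row fill order, and the function name are assumptions of this example.

```python
ROWS = "ABCDEFGH"          # 8 rows of a standard 96-well plate
COLUMNS = range(1, 13)     # 12 columns


def assign_wells(fragment_ids):
    """Map well labels (A1 .. H12) to fragment IDs, filling row by row."""
    wells = [f"{row}{col}" for row in ROWS for col in COLUMNS]
    if len(fragment_ids) > len(wells):
        raise ValueError("more fragments than wells on one plate")
    return dict(zip(wells, fragment_ids))
```

A colony arising from cells transformed in a given well can then be linked to its fragment by a simple lookup in the returned mapping.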

Diagnostic Kits and Assays 
The present invention further relates to kits and
assays that can be used for rapid and efficient detection of
S. pneumoniae cells. Also contemplated are kits for
detecting mutations carried by S. pneumoniae cells. Kits of
this nature are particularly attractive in the clinical
environment where knowledge about the identity of a pathogen
and/or of the basis for resistance to antibiotic treatments
is essential for effective medical treatment. In the long
term, knowledge of the mutations that lead to resistance
will enable the design of new antibacterial agents.

A kit for detecting S. pneumoniae cells can be based on
antibody recognition of S. pneumoniae specific antigens or
epitopes, or on nucleic acid hybridization techniques for
the detection of S. pneumoniae specific nucleic acid
molecules.

A variety of embodiments are contemplated in this
aspect of the invention. In one embodiment a kit is provided
for detecting mutations in drug-resistant S. pneumoniae. For
this purpose, DNA is prepared from a resistant isolate and
from a wild-type strain. In a preferred embodiment, the
polymerase chain reaction (i.e. PCR) is used to amplify DNA
samples representing any one or all of the genomic fragments
disclosed herein. The amplified DNA samples from the mutant
and wild-type strains are labeled by any suitable means, for
example using radioisotopes or fluorescent labeling, and are
hybridized to a DNA chip having fixed thereon any one or more
of the genomic fragments disclosed herein. Hybridization of
the amplified DNAs to the chip under conditions that can
discriminate single or multiple base pair mismatches enables
the detection of differences between the mutant and
wild-type samples. This method identifies a specific fragment
of the genome that is altered in the mutant strain. The
specific mutation can be determined by conventional DNA
sequence analysis.
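
The comparison of mutant and wild-type hybridization described above may be illustrated with the following Python sketch. It rests on assumptions that are not stated in the disclosure: the chip is taken to carry wild-type sequence, so that a mismatch in the mutant sample lowers its signal relative to the wild-type sample, and a fragment is flagged when the mutant-to-wild-type signal ratio falls below a cutoff. The function name, the cutoff of 0.5, and the fragment identifiers are hypothetical.

```python
def altered_fragments(wild_type, mutant, min_ratio=0.5):
    """Flag fragments whose mutant signal is low relative to wild type.

    wild_type and mutant map fragment IDs to hybridization signal
    (arbitrary units). Returns (fragment_id, ratio) pairs for fragments
    whose mutant / wild-type ratio is below min_ratio.
    """
    flagged = []
    for fragment_id, wt_signal in wild_type.items():
        if wt_signal <= 0:
            continue  # no wild-type signal, so no basis for comparison
        ratio = mutant.get(fragment_id, 0.0) / wt_signal
        if ratio < min_ratio:
            flagged.append((fragment_id, round(ratio, 3)))
    return flagged
```

Fragments flagged in this way would be candidates for the conventional sequence analysis referred to above.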

This aspect of the invention also relates to the
detection of S. pneumoniae proteins in a sample using
antibody molecules raised against the protein encoded by any
suitable ORF disclosed herein. Antibody detection methods are
well known to those skilled in the art including, for
example, a variety of radioimmunological assays (See e.g. P.
Tijssen, Practice and Theory of Enzyme Immunoassays:
Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands,
1985).

Test samples suitable for use in this aspect of the
invention include but are not limited to biological fluids
such as sputum, blood, serum, plasma, and urine, and biopsy
samples.

Skilled artisans will recognize that the disclosed
method and reagents can be readily incorporated into a kit.
For example, a kit would contain one or more receptacles
comprising one or more of the following: PCR reagents, DNA
chip reagents, labeling reagents, assorted buffers, and/or
antibodies.


Production of Antibodies 

The proteins of this invention and fragments
thereof may be used in the production of antibodies. The
term "antibody" as used herein describes antibodies,
fragments of antibodies (such as, but not limited to, Fab,
Fab', Fab2', and Fv fragments), and chimeric, humanized,
veneered, resurfaced, or CDR-grafted antibodies capable of
binding antigens of a similar nature as the parent antibody
molecule from which they are derived. The instant invention
also encompasses single chain polypeptide binding molecules.

The production of antibodies, both monoclonal and
polyclonal, in animals is well known in the art. See, e.g.,
C. Milstein, Handbook of Experimental Immunology (Blackwell
Scientific Pub., 1986); J. Goding, Monoclonal Antibodies:
Principles and Practice (Academic Press, 1983). For the
production of monoclonal antibodies the process begins with
injecting a mouse, or other suitable animal, with an
immunogen. The mouse is subsequently sacrificed and cells
taken from its spleen are fused with myeloma cells,
resulting in a hybridoma that can be cultured in vitro.
Hybridomas are screened for clones that secrete a single
antibody species, specific for the immunogen.

Chimeric antibodies are described in U.S. Patent No.
4,816,567, herein incorporated by reference, which teaches
methods and vectors for preparing chimeric antibodies. An
alternative approach is provided in U.S. Patent No.
4,816,397, the entire contents of which are herein
incorporated by reference. This patent teaches
co-expression of heavy and light chains in the same host cell.

The method taught in U.S. Patent 4,816,397 has 
been further refined in European Patent Publication No. 0 
239 400. The teachings of this publication are preferred for
engineering monoclonal antibodies. In this technology the
complementarity determining regions (CDRs) of a human 
antibody are replaced with the CDRs of a murine monoclonal 
antibody, thereby converting the specificity of the human 
antibody to the specificity of the murine antibody. 

Single chain antibodies and libraries thereof
provide yet another means for genetically engineering
antibody molecules. (See, e.g., R.E. Bird, et al., Science
242:423-426 (1988); PCT Publication Nos. WO 88/01649, WO
90/14430, and WO 91/10737.) Single chain antibody technology
involves covalently joining the binding regions of heavy and
light chains, thereby generating a single polypeptide chain
having the binding specificity of an intact antibody
molecule.

The antibodies contemplated by the present invention 
are useful in diagnostics, therapeutics, or in 
diagnostic/therapeutic combinations.

The proteins of this invention, or suitable fragments
thereof, can be used to generate polyclonal or monoclonal
antibodies, various inter-species hybrids, humanized
antibodies, antibody fragments, or single-chain
antibodies. The techniques for producing antibodies are well
known to skilled artisans. (See e.g. A.M. Campbell,
Monoclonal Antibody Technology: Laboratory Techniques in
Biochemistry and Molecular Biology, Elsevier Science
Publishers, Amsterdam (1984); Kohler and Milstein, Nature
256, 495-497 (1975); Monoclonal Antibodies: Principles &
Applications, Ed. J.R. Birch & E.S. Lennox, Wiley-Liss, 1995.)

A protein or peptide to be used as an immunogen may be
administered in an adjuvant by subcutaneous or
intraperitoneal injection into, for example, a mouse or a
rabbit. For the production of monoclonal antibodies, spleen
cells from immunized animals are removed, fused with myeloma
cells, such as SP2/0-Ag14 cells, and allowed to become
monoclonal antibody producing hybridoma cells in the manner
known to the skilled artisan. Hybridomas that secrete the
desired antibody molecule can be screened by a variety of
well known methods, for example ELISA, western blot
analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res.
175, 109-124 (1988); Monoclonal Antibodies: Principles &
Applications, Ed. J.R. Birch & E.S. Lennox, Wiley-Liss, 1995).

For some applications it is desirable to have an
antibody labeled in some fashion. Procedures for labeling
antibody molecules with radioisotopes, affinity labels such
as biotin or avidin, enzymatic labels, for example
horseradish peroxidase, and fluorescent labels such as FITC
or rhodamine, are widely known (See e.g. Enzyme-Mediated
Immunoassay, Ed. T. Ngo, H. Lenhoff, Plenum Press 1985;
Principles of Immunology and Immunodiagnostics, R.M. Aloisi,
Lea & Febiger, 1988).

Labeled antibodies are useful for a variety of
diagnostic applications. In one embodiment, the present
invention relates to the use of labeled antibodies to detect
the presence of S. pneumoniae cells and proteins. Also
contemplated are applications that use antibodies,
preferably single chain antibodies, directed against an S.
pneumoniae protein. Proteins identified as "external
targets" are preferred for the generation of single chain
antibodies. Single chain antibody libraries directed against
S. pneumoniae surface proteins and cell wall proteins can be
produced by applying the phage display technique to crude
membrane preparations. Antibodies that recognize and bind to
external target proteins and/or cell wall proteins could be
used as therapeutic agents to inhibit the growth of S.
pneumoniae. Alternatively, the antibodies could be used in a
screen to identify potential inhibitors of an external
target protein. For example, in a competitive displacement
assay, an antibody or compound to be tested is labeled by
any suitable method. Competitive displacement of an antibody
from an antibody-antigen complex by a test compound provides
a means to identify new antibacterial compounds.
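
By way of illustration only, readings from a competitive displacement assay of this kind might be scored as in the following Python sketch. The function names, the optional background correction, and the 50% cutoff are hypothetical choices made for the example and are not taken from this disclosure.

```python
def percent_displacement(signal_no_compound, signal_with_compound, background=0.0):
    """Percent of labeled antibody displaced from the antibody-antigen complex.

    signal_no_compound: label signal of the complex without test compound.
    signal_with_compound: label signal remaining after adding the test compound.
    """
    bound_reference = signal_no_compound - background
    if bound_reference <= 0:
        raise ValueError("reference signal must exceed background")
    # Clamp at zero so a reading below background counts as full displacement.
    bound_test = max(signal_with_compound - background, 0.0)
    return 100.0 * (1.0 - bound_test / bound_reference)


def candidate_inhibitors(readings, reference_signal, background=0.0, cutoff=50.0):
    """Return the names of test compounds whose displacement meets the cutoff.

    readings maps a compound name to the signal measured with that compound.
    The cutoff (in percent) is an assumed, adjustable threshold.
    """
    hits = []
    for name, signal in readings.items():
        if percent_displacement(reference_signal, signal, background) >= cutoff:
            hits.append(name)
    return hits
```

With a reference signal of 1000 units, a compound leaving 250 units bound would score 75% displacement and would be retained under the assumed cutoff.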

Protein Production Methods 
The present invention relates further to
substantially purified proteins encoded by the ORFs
disclosed herein (SEQ ID NO: 87 through SEQ ID NO: 228).

Skilled artisans will recognize that proteins can 
be synthesized by different methods, for example, chemical 
methods or recombinant methods, as described in U.S. Patent
4,617,149, hereby incorporated by reference. 

The principles of solid phase chemical synthesis
of polypeptides are well known in the art and may be found
in general texts relating to this area. See, e.g., H. Dugas
and C. Penney, Bioorganic Chemistry (1981) Springer-Verlag,
New York, 54-92. Peptides may be synthesized by solid-phase
methodology utilizing an Applied Biosystems 430A peptide
synthesizer (Applied Biosystems, Foster City, CA) and
synthesis cycles supplied by Applied Biosystems. Protected
amino acids, such as t-butoxycarbonyl-protected amino acids,
and other reagents are commercially available from many
chemical supply houses.

The proteins and peptides of the present invention
can also be made by recombinant DNA methods. Recombinant
methods are preferred if a high yield is desired.
Recombinant methods involve expressing a cloned ORF/gene in
a suitable host cell. A gene is introduced into a host cell
by any suitable means, well known to those skilled in the
art. While chromosomal integration of a cloned gene is
within the scope of the present invention, it is preferred
that a cloned gene be maintained extra-chromosomally, as
part of a vector wherein the gene is in operable linkage to
a constitutive or inducible promoter.

Recombinant methods are also useful in
overproducing a membrane-bound or membrane-associated
protein. In some cases, membranes prepared from recombinant
cells that overexpress such proteins provide an enriched
source of the protein. Such membranes are useful for
evaluating the function of the protein and/or for evaluating
inhibitors of the protein.




Expressing Recombinant Proteins in Procaryotic and
Eucaryotic Host Cells

Procaryotes are generally used for cloning DNA
sequences and for constructing vectors. For example, the
Escherichia coli K12 strain 294 (ATCC No. 31446) is
particularly useful for expression of foreign proteins.
Other strains of E. coli, bacilli such as Bacillus subtilis,
enterobacteriaceae such as Salmonella typhimurium or
Serratia marcescens, and various Pseudomonas species may also
be employed as host cells in cloning and expressing the
recombinant proteins of this invention. Also contemplated
are various strains of Streptococcus and Streptomyces.

For effective expression of a recombinant protein
a gene or ORF may be linked to a known promoter sequence.
Suitable bacterial promoters include β-lactamase [e.g.
vector pGX2907, ATCC 39344, contains a replicon and
β-lactamase gene], lactose systems [Chang et al., Nature
(London), 275:615 (1978); Goeddel et al., Nature (London),
281:544 (1979)], alkaline phosphatase, and the tryptophan
(trp) promoter system [vector pATH1 (ATCC 37695)] designed
for the expression of a trpE fusion protein. Hybrid
promoters such as the tac promoter (isolatable from plasmid
pDR540, ATCC-37282) are also suitable. Promoters for use in
bacterial systems also will contain a Shine-Dalgarno
sequence operably linked to the DNA encoding the desired
polypeptides. These examples are illustrative rather than
limiting.

A variety of mammalian cell systems and yeasts are
also suitable host cells. The yeast Saccharomyces
cerevisiae is a commonly used eucaryotic microorganism.
Other yeasts such as Kluyveromyces lactis are also suitable.
For expression of recombinant genes in Saccharomyces, the
plasmid YRp7 (ATCC-40053), for example, may be used. See,
e.g., L. Stinchcomb, et al., Nature, 282:39 (1979); J.
Kingsman et al., Gene, 7:141 (1979); S. Tschemper et al.,
Gene, 10:157 (1980). Plasmid YRp7 contains the TRP1 gene
that provides a selectable marker in a trp1 mutant.

Purification of Recombinantly-Produced Protein

An expression vector carrying an ORF of the
present invention is transformed or transfected into a
suitable host cell using standard methods. Cells which
contain the vector are propagated under conditions suitable
for expression of the encoded protein. If the gene is under
the control of an inducible promoter then suitable growth
conditions would incorporate the appropriate inducer. The
recombinantly-produced protein may be purified from cellular
extracts of transformed cells by any suitable means.

In a preferred process for protein purification a
gene/ORF is modified at the 5' end, or some other position,
to incorporate a plurality of histidine residues at the
amino terminus of the encoded protein. The "histidine tag"
produced thereby enables a single-step protein purification
method referred to as "immobilized metal ion affinity
chromatography" (IMAC), essentially as described in U.S.
Patent 4,569,794, hereby incorporated by reference. The IMAC
method enables rapid isolation of substantially pure protein
starting from a crude cellular extract.

As skilled artisans will recognize, the proteins
of the invention can be encoded by a multitude of different
nucleic acid sequences owing to the degeneracy of the
genetic code. The present invention further comprises these
alternate nucleic acid sequences.
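
The extent of this degeneracy can be illustrated with a short Python sketch that counts how many distinct nucleic acid sequences encode a given peptide. The function name is hypothetical, and the sketch assumes the standard genetic code with stop codons ignored.

```python
from math import prod

# Number of codons specifying each amino acid in the standard genetic code
# (61 sense codons in total; stop codons are not counted).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(peptide):
    """Return the number of distinct coding sequences for a peptide.

    The peptide is given in one-letter amino acid code. Each residue
    contributes independently, so the counts are multiplied together.
    """
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())
```

For example, the peptide MVEVPDER, which corresponds to the first eight codons of SEQ ID NO:1 read from its first base, could be encoded by 3072 different DNA sequences.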

The ribonucleic acid compounds of the present
invention may be prepared using the polynucleotide synthetic
methods discussed supra, or they may be prepared
enzymatically using RNA polymerase to transcribe a DNA
template.

The most preferred systems for preparing the
ribonucleic acids of the present invention employ the RNA
polymerase from the bacteriophage T7 or the bacteriophage
SP6. These RNA polymerases are highly specific, requiring
the insertion of bacteriophage-specific sequences at the 5'
end of the template to be transcribed. See, J. Sambrook, et
al., supra, at 18.82-18.84.
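
The placement of a bacteriophage-specific sequence at the 5' end of a template can be sketched in Python as follows. The sketch assumes the commonly cited minimal T7 promoter, in which transcription begins at the final G; the function names are hypothetical, the SP6 promoter is not covered, and the in silico transcript is only a simple run-off model.

```python
# Commonly cited minimal T7 promoter; the final G is the first transcribed base.
T7_PROMOTER = "TAATACGACTCACTATAG"


def add_t7_promoter(template):
    """Return the sense-strand template with the T7 promoter at its 5' end."""
    return T7_PROMOTER + template.upper()


def run_off_transcript(construct):
    """Return the RNA expected from run-off transcription of the construct.

    The transcript starts at the last base of the promoter and runs to the
    end of the sense strand, with T replaced by U.
    """
    sense = construct.upper()
    start = sense.find(T7_PROMOTER)
    if start < 0:
        raise ValueError("no T7 promoter found in construct")
    first_transcribed = start + len(T7_PROMOTER) - 1
    return sense[first_transcribed:].replace("T", "U")
```

Under these assumptions, a template beginning ATGGTG would yield a transcript beginning GAUGGUG.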

This invention also provides nucleic acids, RNA or
DNA, which are complementary to the sequences disclosed
herein.
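
A complementary sequence can be derived mechanically from any sequence disclosed herein. The following Python sketch is one way to do so; the function names are hypothetical, and only the unambiguous bases A, C, G, and T are handled.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary DNA strand, written 5' to 3'."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def complementary_rna(dna):
    """Return the complementary RNA strand, written 5' to 3'."""
    return reverse_complement(dna).replace("T", "U").replace("t", "u")
```

Applied to the first ten bases of SEQ ID NO:1 (ATGGTGGAAG), the sketch returns CTTCCACCAT as complementary DNA and CUUCCACCAU as complementary RNA.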

The present invention also provides probes and
primers useful for a variety of molecular biology techniques
including, for example, hybridization screens of genomic or
subgenomic libraries, detection and quantification of mRNA
species as a means of analyzing gene expression, and
amplification of any region of the Streptococcus pneumoniae
genome disclosed by the sequences herein. A nucleic acid
compound is provided comprising any of the sequences
disclosed herein, or a complementary sequence thereof, or a
fragment thereof, which is at least 15 base pairs in length,
and which will hybridize selectively to Streptococcus
pneumoniae DNA or mRNA. Preferably, the 15 or more base pair
compound is DNA. A probe or primer length of at least 15
base pairs is dictated by theoretical and practical
considerations. See e.g. B. Wallace and G. Miyada,
"Oligonucleotide Probes for the Screening of Recombinant DNA
Libraries," In Methods in Enzymology, Vol. 152, 432-442,
Academic Press (1987).
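
As a rough illustration of the length requirement, the Python sketch below enumerates candidate probes of at least 15 bases from a disclosed sequence and estimates a melting temperature for each. The estimate uses the widely known rule of thumb of 2 degrees C per A or T and 4 degrees C per G or C, which is only an approximation for short oligonucleotides; the function names are hypothetical, and no selectivity check against other genomes is attempted.

```python
MIN_PROBE_LENGTH = 15  # minimum probe or primer length stated above


def approximate_tm(oligo):
    """Estimate the melting temperature (degrees C) of a short oligonucleotide.

    Uses the 2 * (A + T) + 4 * (G + C) rule of thumb.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def candidate_probes(sequence, length=MIN_PROBE_LENGTH):
    """Return every contiguous fragment of the requested length."""
    if length < MIN_PROBE_LENGTH:
        raise ValueError("probes shorter than 15 bases are not considered")
    seq = sequence.upper()
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]
```

For instance, the first 20 bases of SEQ ID NO:1 give six overlapping 15-base candidates, the first of which (ATGGTGGAAGTTCCA) has an estimated melting temperature of 44 degrees C under this rule.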

The probes and primers of this invention can be
prepared by methods well known to those skilled in the art
(See e.g. Sambrook et al. supra). In a most preferred
embodiment these probes and primers are synthesized by the
polymerase chain reaction (PCR).

The present invention also relates to recombinant
DNA cloning vectors and expression vectors comprising the
nucleic acids of the present invention. Preferred nucleic
acid vectors are those which comprise DNA. The skilled
artisan understands that choosing the most appropriate
cloning vector or expression vector depends on a number of
factors including the availability of restriction enzyme
sites, the type of host cell into which the vector is to be
transfected or transformed, the purpose of the transfection
or transformation (e.g., stable transformation as an
extrachromosomal element, or integration into the host
chromosome), the presence or absence of readily assayable or
selectable markers (e.g., antibiotic resistance and
metabolic markers of one type and another), and the number
of gene copies desired in the host cell.

Vectors suitable to carry the nucleic acids of the
present invention comprise RNA viruses, DNA viruses, lytic
bacteriophages, lysogenic bacteriophages, stable
bacteriophages, plasmids, viroids, and the like. The most
preferred vectors are plasmids.

Host cells harboring the nucleic acids disclosed
herein are also provided by the present invention. A
preferred host is E. coli which has been transfected or
transformed with a vector that comprises a nucleic acid of
the present invention.

The present invention also provides a method for
constructing a recombinant host cell capable of expressing
an ORF disclosed herein, said method comprising transforming
or otherwise introducing into a host cell a recombinant DNA
vector that comprises an isolated DNA sequence which encodes
said ORF. The preferred host cell is any strain of E. coli
which can accommodate high level expression of an exogenously
introduced gene. Transformed host cells are cultured under
conditions well known to skilled artisans such that said ORF
is expressed, thereby producing the encoded protein in the
recombinant host cell.

For the purpose of discovering new inhibitors of 
cell wall biosynthesis, it would be desirable to determine 
agents that inhibit enzymes required for synthesis of the 
cell wall and/or agents that interact with membrane 
proteins. A method for identifying compounds that interact 
with such enzymes and membrane proteins comprises contacting 
said proteins with a test compound and monitoring an 
interaction and/or inhibition by any suitable means. 

The instant invention provides a screening system 
for compounds that interact with membrane proteins of this 
invention, said screening system comprising the steps of: 

a) preparing a membrane protein, or membranes 
enriched in said protein; 

b) exposing the protein source of (a) to a test 
compound; and 

c) quantifying the interaction of said protein with 
said compound by any suitable means. 

The screening method of this invention may be 
adapted to automated procedures such as a PANDEX® (Baxter-Dade
Diagnostics) system, allowing for efficient high-volume
screening of compounds. 

In a typical screening protocol, a protein to be 
tested is prepared as described herein, preferably using 
recombinant DNA technology. A test compound is introduced 
into a reaction vessel containing said protein. The 
reaction/interaction of said protein and said compound is 
monitored by any suitable means. For example, a 
radioactively-labeled or chemically-labeled compound or
protein may be used. Specific association between a test
compound and protein is monitored by any suitable means. 

The following examples more fully describe the
present invention. Those skilled in the art will recognize
that the particular reagents, equipment, and procedures
described are merely illustrative and are not intended to
limit the present invention in any manner.

EXAMPLE 1 

Vector for Expressing S. pneumoniae ORF in a Host Cell

An expression vector suitable for expressing an S.
pneumoniae gene or fragment thereof in a variety of
procaryotic host cells, such as E. coli, is easily made. A
suitable parent vector contains an origin of replication
(Ori), a marker for selecting transformants, for example, an
ampicillin resistance gene (Amp), and further comprises
suitable transcriptional and translational signals, for
example, the T7 promoter and T7 terminator sequences, in
operable linkage to an S. pneumoniae coding region. For
example, pET11A (obtained from Novagen, Madison WI) is
linearized by restriction with endonucleases NdeI and BamHI.
Linearized pET11A is ligated to a DNA fragment bearing NdeI
and BamHI sticky ends and comprising a coding region for an
S. pneumoniae ORF.

The ORF used in this construction may be modified
at the 5' end (amino terminus of encoded protein or peptide)
to simplify purification of the encoded protein or peptide.
For this purpose, an oligonucleotide encoding 8 histidine
residues is inserted after the transcriptional and
translational start sites. Placement of the histidine
residues at the amino terminus of the encoded protein
enables the IMAC one-step protein purification procedure.
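
The modification described above can be sketched in Python as follows. This is a minimal illustration rather than the construction actually used: the function names are hypothetical, the histidine codon CAT is an arbitrary choice (CAC also encodes histidine), and the eight codons are assumed to be placed directly after the ATG start codon.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

HIS_CODON = "CAT"  # assumption: chosen arbitrarily; CAC would serve equally


def add_n_terminal_his_tag(orf, n_his=8):
    """Insert n_his histidine codons directly after the ATG start codon."""
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("ORF is expected to begin with an ATG start codon")
    return "ATG" + HIS_CODON * n_his + orf[3:]


def translate(dna):
    """Translate complete codons from the first base, stopping at a stop codon."""
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[pos:pos + 3].upper()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)
```

Tagging the first seven codons of SEQ ID NO:1 (ATGGTGGAAGTTCCAGATGAA) in this way yields a coding region whose translation begins MHHHHHHHHVEVPDE.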



Example 2

Recombinant Expression and Purification of a Protein Encoded
by an S. pneumoniae ORF

An expression vector that carries an ORF from the
S. pneumoniae genome, as disclosed in Example 1, and in which
the ORF is operably linked to an expression promoter, is
transformed into E. coli BL21(DE3) (hsdS gal λcIts857
ind1 Sam7 nin5 lacUV5-T7 gene 1) using standard methods.
Transformants, selected for resistance to ampicillin, are
chosen at random and tested for the presence of the vector
by agarose gel electrophoresis using quick plasmid
preparations. Colonies that contain the vector are grown in
L broth and the protein produced by the vector-borne ORF is
purified by IMAC, essentially as described in U.S. Patent
4,569,794.

Briefly, the IMAC column is prepared as follows. A
metal-free chelating resin (e.g. Sepharose 6B IDA,
Pharmacia) is washed in distilled water to remove
preservatives and then infused with a suitable metal ion
[e.g. Ni(II), Co(II), or Cu(II)] by adding a 50 mM metal
chloride or metal sulfate aqueous solution until about 75%
of the interstitial spaces of the resin are saturated with
colored metal ion. The column is then ready to receive a
crude cellular extract containing the recombinant protein
product.

Unbound proteins and other materials are removed by 
washing the column with any suitable buffer, pH 7.5. Bound 
protein is eluted in any suitable buffer at pH 4.3, or 
preferably with an imidazole-containing buffer at pH 7.5.

Example 3 
DNA Chip Production 
Any one or more of the S. pneumoniae genome DNA
fragments disclosed herein, or fragments thereof, are arrayed
onto a solid support. It is preferred that fragments be in
the size range of 14 base pairs to 500 base pairs. The DNA
samples are most conveniently synthesized by PCR using
standard methods to amplify regions disclosed by the genomic
sequences herein. The method of Schena et al. is used to
spot about 1 ng to 10 ng of a DNA sample onto glass
microscope slides that have been treated with poly-L-lysine
(M. Schena et al. "Quantitative monitoring of gene
expression patterns with a complementary DNA microarray"
Science, 270, 467-470, 1995). After spotting DNA samples
onto the chip and air-drying, the chips are rehydrated by
incubation for about 2 hours in a humid chamber. Chips are
then placed at 100° C for 1 minute, rinsed in 0.1% SDS, and
treated with 0.05% succinic anhydride in 50%
1-methyl-2-pyrrolidinone and 50% boric acid.


Example 4 

S. pneumoniae Gene Expression Analysis using DNA Chips 
RNA prepared from cells grown under any desirable
conditions is used as a template for cDNA synthesis by reverse
transcription, using methods well known to the skilled
artisan (See e.g. Molecular Cloning, 2d Ed., J. Sambrook, E.
Fritsch, T. Maniatis, 1989). For example, total RNA of
strain R6 is prepared according to the method of Logeman
et al. (Analytical Biochemistry, 1987, 163, 16-20) using
guanidine hydrochloride. After ethanol precipitation, the
total RNA is dissolved in a buffered solution such as
Tris-EDTA (TE). Complementary DNAs are synthesized with the aid
of the StrataScript RT-PCR kit (Stratagene, Inc.) in
accordance with the supplier's recommendations (See Schena
et al. Id.). Briefly, a 50 ul reaction contains about 0.1
ug/ul of RNA. First strand synthesis is primed using random
primers in 1X first strand buffer with 0.03 U/ul ribonuclease
block, 500 uM dATP, 500 uM dTTP, 40 uM dGTP, 40 uM
fluorescein-12-dCTP (New England Nuclear), and 0.03 U/ul
reverse transcriptase. Reactions are incubated for 60
minutes at 37° C, precipitated with ethanol, and resuspended
in 10 ul TE pH 8. Samples are heated for 3 minutes at 94° C
and chilled on ice. The RNA is degraded by adding 0.25 ul of
10 N NaOH, followed by a 10 minute incubation at 37° C. The
samples are neutralized with 2.5 ul of 1M Tris-HCl, pH 8 and
0.25 ul of 10 N HCl. After ethanol precipitation, the
nucleic acid pellet is washed and dried in vacuo.
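
The quantities implied by the stated concentrations can be checked with a few lines of Python. The helper names below are hypothetical; the figures are those given in the protocol above, and the acid and base are treated as monoprotic so that normality equals molarity.

```python
REACTION_VOLUME_UL = 50.0  # reaction volume stated in the protocol


def total_amount(per_ul, volume_ul=REACTION_VOLUME_UL):
    """Total amount for a component given per microliter (ug/ul or U/ul)."""
    return per_ul * volume_ul


def nmol_from_micromolar(conc_um, volume_ul=REACTION_VOLUME_UL):
    """Nanomoles of a solute at conc_um (uM) in volume_ul microliters.

    1 uM equals 1 pmol/ul, so pmol = uM * ul and nmol = pmol / 1000.
    """
    return conc_um * volume_ul / 1000.0


def umol_from_normality(normality, volume_ul):
    """Micromoles of a monoprotic acid or base (mol/L times ul gives umol)."""
    return normality * volume_ul


rna_ug = total_amount(0.1)                 # total RNA in the reaction, ug
enzyme_units = total_amount(0.03)          # units of reverse transcriptase
datp_nmol = nmol_from_micromolar(500)      # nmol of dATP (same for dTTP)
labeled_dctp_nmol = nmol_from_micromolar(40)
naoh_umol = umol_from_normality(10, 0.25)  # base added to degrade the RNA
hcl_umol = umol_from_normality(10, 0.25)   # acid added to neutralize it
```

On these figures a reaction holds about 5 ug of RNA, 1.5 U of reverse transcriptase, 25 nmol each of dATP and dTTP, and 2 nmol of fluorescein-12-dCTP, and the 2.5 umol of NaOH is matched by 2.5 umol of HCl.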

Prior to hybridization, DNA chips prepared as in Example
3 are denatured by heating to 90° C for 2 minutes.
Hybridization reactions contain about 1 ul of
fluorescently-labeled cDNA and 1 ul of hybridization buffer
(10X SSC and 0.2% SDS). Probe mixtures are transferred to the
surface of the chip, covered with a cover slip, and incubated
for 18 hours at 65° C. Chips are washed for 5 minutes at room
temperature in 1X SSC, 0.1% SDS, then for 10 minutes at room
temperature in 0.1X SSC, 0.1% SDS. After hybridization and
washing, chips are scanned with a laser-scanning device.

Example 5

A DNA Bio Chip for Mutation Analysis

Duplicate DNA chips are prepared as in Example 3. Each
chip is overlaid with S. pneumoniae cells in a semi-solid
medium, wherein said cells carry a temperature-sensitive
(ts) mutation in a gene required for autolytic activity
(Lyt-). This mutation leads to resistance to lysis at 37° C,
but sensitivity to lytic treatments at 30° C.

S. pneumoniae strain cwl is resistant to lysis by
detergent and penicillin when grown at 37° C, but remains
sensitive when grown at 30° C (cwl is derived from strain
R6; See P. Garcia et al. "Mutants of Streptococcus
pneumoniae that contain a temperature-sensitive autolysin" J.
Gen. Microbiol. 132, 1401-05, 1986). Strain cwl is grown at
30° C and competent cells are prepared according to any
suitable method (e.g. LeBlanc et al. Plasmid 28, 130-145,
1992; Pozzi et al. J. Bacteriol. 178, 6087-6090, 1996).
Competent cwl cells are harvested by centrifugation and
resuspended at about 10^5 cells per ml in 1% melted agar
supplemented with 0.1% (w/v) yeast extract (Difco) and
containing 1% to 2% Triton X-100. Approximately 100 ul to
500 ul of the cell mixture is deposited per square
centimeter onto the bio-chip by pipetting onto the chip
surface. After solidification of the agar layer, one of the
bio-chips is incubated at 37° C and the other at 30° C.
Cells that take up a complementing genomic DNA fragment from
the chip surface will be lysed at both 30° C and 37° C,
while non-complemented cells are lysed only at 30° C. Cells
that are complemented by the bio-chip are recognizable by
this phenotypic difference and can be further purified by
well known methods.
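
The readout of the duplicate chips can be summarized in a small Python sketch. The function names and result labels are hypothetical; the decision rule simply restates the phenotypes described above, and the plating density follows from the stated cell concentration and overlay volume.

```python
def classify_spot(lysed_at_30c, lysed_at_37c):
    """Classify a chip position from lysis observed on the two duplicate chips."""
    if lysed_at_30c and lysed_at_37c:
        return "complemented"      # lysis restored at the restrictive 37 C
    if lysed_at_30c and not lysed_at_37c:
        return "non-complemented"  # parental temperature-sensitive phenotype
    return "unexpected"            # no lysis at 30 C is not predicted above


def cells_per_square_cm(cells_per_ml=1e5, overlay_ul_per_cm2=100.0):
    """Approximate number of cells deposited per square centimeter of chip."""
    return cells_per_ml * overlay_ul_per_cm2 / 1000.0
```

At about 10^5 cells per ml, overlays of 100 ul to 500 ul per square centimeter correspond to roughly 10,000 to 50,000 cells per square centimeter.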
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Baltz, Richard H.
Burgett, Stanley G.
DeHoff, Bradley S.
Jaskunas Jr., Stanley R.
Mills, Bradley J.
Norris, Franklin H.
Peery, Robert B.
Rosteck Jr., Paul R.
Skatrud, Paul L.
Smith, Michele C.
Rockey, Pamela K.
Young-Bellido, Michele

(ii) TITLE OF INVENTION: Streptococcus Pneumoniae DNA Sequences

(iii) NUMBER OF SEQUENCES: 122

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Eli Lilly and Company
(B) STREET: Lilly Corporate Center
(C) CITY: Indianapolis
(D) STATE: Indiana
(E) COUNTRY: U.S.
(F) ZIP: 46285

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Webster, Thomas D.
(B) REGISTRATION NUMBER: 39,872
(C) REFERENCE/DOCKET NUMBER: X-11162

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 317-276-3334



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1267 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATGGTGGAAG TTCCAGATGA ACGCCTACAA AAACTAACTG AAATGATAAC TCCTAAAAAG 60
ACAGTTCCCA CAACATTTGA ATTTACAGAT ATTGCAGGGA TTGTAAAAGG AGCTTCAAAA 120
GGAGAAGGGC TAGGGAATAA ATTCTTGGCC AATATTCGTG AAGTAGATGC GATTGTTCAC 180
GTAGTTCGTG CTTTTGATGA TGAAAATGTA ATGCGCGAGC AAGGACGTGA AGACGCCTTT 240
GTAGATCCAC TTGCAGATAT TGATACAATT AATCTGGAAT TAATTCTTGC TGACTTAGAA 300
TCAGTGAACA AACGATATGC GCGTGTAGAA AAGATGGCAC GTACGCAAAA AGATAAAGAA 360
TCAGTAGCAG AATTCAATGT TCTTCAAAAG ATTAAACCAG TCCTAGAAGA CGGGAAATCA 420
GCTCGTACCA TTGAATTTAC AGATGAGGAA CAAAAGGTTG TCAAAGGTCT TTTCCTTTTG 480
ACGACTAAAC CAGTTCTTTA TGTAGCTAAT GTGGACGAGG ATGTGGTTTC AGAACCTGAC 540
TCTATCGACT ATGTCAAACA AATTCGTGAA TTTGCAGCGA CAGAAAATGC TGAAGTAGTC 600
GTTATTTCTG CGCGTGCTGA GGAAGAAATT TCTGAATTGG ATGATGAAGA TAAAAAAGAG 660
TTTCTTGAAG CCATTGGTTT GACAGAATCA GGTGTAGATA AGTTGACGCG TGCAGCTTAC 720
CACTTGCTTG GATTGGGAAC TTACTTCACA GCTGGTGAAA AAGAAGTTCG CGCTTGGACT 780
TTCAAACGTG GTATGAAGGC TCCTCAAGCA GCTGGTATTA TCCACTCAGA CTTTGAAAAA 840
GGCTTTATTC GTGCAGTAAC CATGTCATAT GAAGATCTAG TGAAATACGG ATCTGAAAAG 900
GCCGTAAAAG AAGCTGGACG CTTGCGTGAA GAAGGAAAAG AATATATCGT TCAAGATGGC 960
GATATCATGG AATTCCGCTT TAATGTCTAA AAATTAATAA ATGGTGTCAA TTAGGTTGGA 1020
AAAAAATTCC AACCCTTTTG GCTTTTGAAA GGAAAAATAA ATGACCAAAT TACTTGTAGG 1080
TTTGGGAAAT CCAGGGGATA AATATTTTGA AACAAACACA ATGTTGGTTT TATGTTGATT 1140
GATCAACTAG CGAAGAAACA GAATGTCACT TTTACACACG ATAAGATATT TCAAGCTGAC 1200
CTAGCATCCT TTTTCCTAAA TGGAGAAAAA ATTTATCTGG GTTAAACCAA CGACCTTTAT 1260
GGATTGA 1267
(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1255 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

TGGTCCGTGG TGCTGAGGAC CCTTAGAGTT CGAGTACCAC AAGGTACCGA CTGTTCGTGA 60
TGCGGAGCTG GCAAGGTTTT AACAGATTTG ATTGAACATG GGCAAGAATT TATCGTTGCC 120
CACGGTGGTC GTGGTGGACG TGGAAATATT CGTTTCGCGA CACCAAAAAT CCTGCACCGG 180
AATCTCTGAA AATGGAGAAC CAGGTCAGGA ACGTGAGTTA CAATTGGAAC TAAAAATCTT 240
GGCAGATGTC GGTTTAGTAG GATTCCCATC TGTAGGGAAG TCAACACTTT TAAGTGTTAT 300
TACCTCAGCT AAGCCTAAAA TTGGTGCCTA CCACTTTACC ACTATTGTAC CAAATTTAGG 360
TATGGTTCGC ACCCAATCCA GGTGAATCCT TTGCAGTAGC CGACTTGCCA GGTTTGATTG 420
AAGGGGCTAG TCCAAGGTGT TGGTTTGGGA ACTCAGTTCC TCCGTCACAT CGAGCGTACA 480
CGTGTTATCC TTCACATCAT TGATATGTCA GCTAGCGAAG GCCGTGATCC ATATGAGGAT 540
TACCTAGCTA TCAATAAAGA GCTGGAGTCT TACAATCTTC GCCTCATGGA GCGTCCACAG 600
ATTATTGTAA CTAATAAGAT GGACATGCCT GAGAGTCAGG AAAATCTTGA AGAATTTAAG 660
AAAAAATTGG CTGAAAATTA TGATGAATTT GAAGAGTTAC CAGCTATCTT CCCAATTTCT 720
GGATTGACCA AGCAAGGTCT GGCAACACTT TTAGATGCTA CAGCTGAATT GTTAGACAAG 780
ACACCAGAAT TTTTGCTCTA CGACGAGTCC GATATGGAAG AAGAAGTTTA CTATGGATTT 840
GACGAAGAAG AAAAAGCCTT TGAAATTAGT CGTGATGACG ATGCGACATG GGTACTTTCT 900
GGTGAAAAAC TCATGAAACT CTTTAATATG ACCAACTTTG ATCGTGATGA ATCTGTCATG 960
AAATTTGCCC GTCAGCTTCG TGGTATGGGG GTTGATGAAG CCCTTCGTGC GCGTGGAGCT 1020
AAAGATGGGG ATTTGGTCCG CATTGGTAAA TTTGAGTTTG AATTTGTAGA CTAGGAGACT 1080
GGTATGGGAG ATAAACCGAT ATCTTTCCGA GATGCGGATG GTAATTTTGT TTCCGCCGCA 1140
GACGTTTGGA ATGAAAAGAA ATTGGAAGAA CTATTTAATC GTCTCAATCC AAATCGTGCC 1200
TTGAGATTGG CACGAACTAC AAAGGAAAAT CCATCTCAGT AAAGAAGCTA AAAAA 1255
(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1609 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

TTACCCATCG CATGACTAAA AATCTCTACT ATCCAATACT AGTTCATATT CTCATCAATA 60
TCACTGCCTT CTGGGATGTT TGGTTACTCC TATTTTCAGG AAGTTAGCTT ACTAAAAAAA 120
TGTCGGAATT TTCCGGCATT TTCTTTTTTC ACAAATAGTC AACGTTTTTC TTTCCGATAC 180
TGAAGTGGTG TGTAGCCACT TATTTTTTTG AATTGATTTT GAAAATAAGA TTGGCGTGAG 240
AAAGGCAGAT AGTGAAGATA GTTAAGAAGA ATAGGATGTT CTTTTTTCCT TTTTGGAAAA 300
CTTCTAAAAT ATGGTATAAT GAAAAGATAA AGAAGTTGGG GGTAGAAGAT GAACATTCAA 360
CAATTACGCT ATGTTGTGGC TATTGCCAAT AGTGGTACTT TTCGTGAAGC TGCTGAAAAG 420
ATGTATGTTA GTCAGCCGAG TCTGTCTATT TCTGTTCGTG ATTTGGAAAA AGAGTTGGGC 480
TTTAAGATTT TCCGTCGGAC CAGCTCAGGG ACTTTCTTGA CCCGTCGTGG GATGGAATTT 540
TATGAAAAAG CGCAAGAATT GGTTAAAGGA TTTGATATTT TTCAAAATCA GTATGCCAAT 600
CCTGAAGAAG AAAAAGATGA ATTTTCCGTT GCTAGCCAGC ACTATGACTT CTTACCACCA 660
ACTATTACGG CCTTTTCAGA GCGCTATCCT GACTATAAGA ACTTCCGTAT TTTTGAATCA 720
ACTACTGTTC AAATATTAGA TGAAGTGGCG CAAGGGCATA GTGAGATTGG GATTATCTAC 780
CTCAACAATC AAAATAAAAA GGGGATTATG CAACGGGTTG AAAAGTTAGG TCTGGAGGTC 840
ATCGAATTGA TTCCTTTCCA TACCCATATT TATCTCTGTG AGGGTCATCC TTTAGCCCAG 900
AAAGAGGAAT TAGTCATGGA GGATTTAGCG GATTTACCAA CGGTTCGTTT CACTCAAGAG 960
AAAGACGAGT ACCTTTATTA TTCAGAGAAC TTTGTCGATA CCAGCGCTAC TCACAGATGT 1020
TTAATGTGAC AGACCGTGCC ACCTTGAATG GTATTTTGGA GCGGACGGAC GCCTATGCGA 1080
CAGGTTCTGG ATTTTTAGAT AGTGACAGTG TTAATGGCAT TACAGTTATT CGTCTCAAGG 1140
ATAACCTAGA TAACCGCATG GTCTATGTTA AACGTGAAGA AGTGGAGCTT AGTCAAGCTG 1200
GGACTCTCTT CGTAGAAGTC ATGCAAGAAT ATTTTGATCA AAAGAGGAAA TCATGAAAAA 1260
AAGAGCAATA GTGGCAGTCA TTGTACTGCT TTTAATTGGG CTGGATCAGT TGGTCAAATC 1320
CTATATCGTC CAGCAGATTC CACTGGGTGA AGTGCGCTCC TGGATTCCCA ATTTCGTTAG 1380
CTTGACCTAC CTGCAAAATC GAGGTGCAGC CTTTTCTATC TTACAAGATC AGCAGCTGTT 1440
ATTCGCTGTC ATTACTCTGG TTGTCGTGAT AGGTGCCATT TGGTATTTAC ATAAACACAT 1500
GGAGGACTCA TTCTGGATGG TCTTGGGTTT GACTCTAATA ATCGCGGGTG GTCCTGGAAA 1560
CTTTATTGAC AGGGTCAGTC AGGGCTTTGT TGTGGATATG TTCCACCTT 1609
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 763 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
GTAACTTAGG GCCCCAAGTC CATAACTTGC TTGACGCATG CTATCACTAA CAGATAAAAG 60

GGCTTCTTCT GTTGAGCGAA TAATGACTGG CAACACCATG ATGACTGAGG TTAAGATTCC 120

TGATAACAGA GAGTATTGAA AACCTAAGAA GACTACAAAG AAGAGCATGC CAAACAGACC 180

AAAAACAATG GAAGGAATCC CAGACAAGGT ATCTGAGGCC AATCGCATGA TTTTAACACA 240

AAGGGAATCT TTTTTTGTAT ATTCCACAAG ATAAAAACCA GCAAAAATCC CTATGGGCAA 300

GGCTAAAAGA AGAGCACCAA AGACCAGAAT AACGGTGGAA ATAATCGCTG GCATAAGGGA 360

AATGTTCTCA GAAGTATAAG TCCAAGAAAA GAGGGATAGA CTTAGATGAG GTAAACCTTT 420

GATGAGGATA AAACCAATGA TTAAAAAGAG AGAGCCAAAG GTTAAAGCTG AAAAACAATA 480

AACGAGAAGT TTTAGCAGGT ATTTACTCAT AAGATGATTT TCCTTTCAAG TAGCCAAAGT 540

AGGCATTAAT CAAGAGAATA AGGAAAAAGA GAACTGCTGA GGTTGCAATA AGGGCTTCCC 600

TATGCTGACC TGATGCGTAA GCCATTTCCA GAACAATATT GGTTGTTAAG GTTCTGGTTC 660

CTGAAAAGAG TCCACTTGGA ATAATCGGCT GGTTGCCTGC CACCAAAATA ACTGCCATGG 720

TTTCACCTAC TGCGCGACCG ATGCCTAAAA TAACTGCTGA AAA 763
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 897 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GGGTCTGTTT TGGCCTTGGC GGCTTCAGGT GGTTCAGGAG CTTGGCAGGG AGCTGGTCTC 60 

ATGTTGGTGT ATACGCTGGG CTTGGCGCTA CCATTCTTGC TTCTAGCTCT GACCTCTAGT 120 

TATGTTTTGA AACATTTCCG AAAACTTCAT CCCTATCTCG GAATCCTCAA AAAAGTGGGT 180 

GGTTTTCTCA TTATTGTGAT GGGATTCTTG GTTCTGTTTG GAAATGCTTC AATTTTAAGT 240

CAATTATTTG AATAAAATGG AAAGGAATAT CAATATGAAA AAATGGCAAA CATGTGTTCT 300

TGGAGCAGGT TCGCTCCTTT GTTTGACGGC TTGTTCAGGC AAGTCCGTGA CTAGTGAACA 360

CCAAACGAAA GATGAAATGA AGACGGAGCA GACAGCTAGT AAAACAAGCG CACTAAAAGG 420

GAAAGAGGTG GCTGATTTTG AATTGATGGG AGTAGATGGC AAGACCTACC GTTTATCTGA 480

TTACAAGGGC AAGAAAGTCT ATCTCAAATT CTGGGCTTCT TGGTGTTCCA TCTGTCTGGC 540

TAGTCTTCCA GATACGGATG AGATTGCTAA AGAAGCTGGT GATGACTATG TGGTCTTGAC 600

AGTAGTGTCA CCAGGACATA AGGGAGAGCA ATCTGAAGCG GACTTTAAGA ATTGGTATAA 660

GGGATTGGAT TATAAAAATC TCCCAGTCCT AGTTGACCCA TCAGGCAAAC TTTTGGAAAC 720

TTATGGTGTC CGTTCTTACC CAACCCAAGC CTTTATAGAC AAAGAAGGCA AGCTGGTCAA 780

AACACATCCA GGATTCATGG AAAAAGATGC AATTTTGCAA ACTTTGAAGG AATTATCCTA 840

GGAGGCGTCT TATGAATGAT AAGTTAAAAA TCTTCTTGTT GCTAGGAGTA TTTTTTC 897

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3499 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

TTTCTTTTTC CTAGGTGATT TTAATGAGGT TGAAATTCAA AATGTATTAG AATCATTTGG 60

CTTTAAAGGT CGAAAAGGAG ATGTGAAGGT TCAGTATTGT CAACCTTATT CTAATATCCT 120 

TCAGGAAGGT ATGGTTCGGA AAAATGTGGG ACAATCCATT TTGGAATTAG GTTATCATTA 180

CCGTTCTAAA TATGGTGATG AGCAACATTT ACCCATGATT GTAATGAATG GTTTACTTGG 240

TGGATTTGCT CACTCTAAGC TCTTTACAAA TGTCCGTGAA AATGCTGGAT TAGCTTATAC 300

CATTTCAAGT GAGCTTGATT TATTTAGTGG ATTCTTGAGG ATGTATGCTG GTATCAATCG 360

AGAAAATCGT AACCAGGCTC GTAAAATGAT GAATAATCAA CTGCTTGATT TAAAAAAAGG 420

TTATTTTACA GAGTTTGAGT TAAATCAGAC CAAGGAAATG ATTCGTTGGT CGTTGTTACT 480

TTCTCAAGAT AATCAATCTT CATTGATTGA ACGTGCTTAT CAAAATGCCT TATTTGGAAA 540

ATCTTCAGCA GACTTTAAAA GTTGGATTGC AAAGCTTGAA CAAATTGACA AAGATGCTAT 600

TTGTAGAGTA GCTAATAATG TGAAACTACA AGCGATTTAC TTTATGGAAG GAATAGAATG 660

ACAAAGGTTG TTTTTGAAGA AAAATACTAT CCAGCTGTAA AAGAAAAGGT TTATCGAACT 720

CGTTTGGCCA ACGGATTGAC AGTTGCTCTT TTGCCTAAAA AGGAATTTAA AGAGGTTTAC 780

GGGAGTGTCA CTGTACAGTT TGGTTCGGTA GATACGTTTG TCACAGAAGT TGACGGATAT 840

GTAAAACAAT ATCCTGGAGG AATTGCTCAT TTTCTTGAAC ATAAATTATT TGAGAGAGAA 900

GATTCTAGTG ATTTGATGTC GGCTTTTACG AGTCTAGGTG CAGATAGTAA TGCCTTTACA 960

AGCTTTACAA AAACAAACTA TCTTTTTTCA GCAACGGATT ATTTTTTAGA AAATTTAGAT 1020

TTACTTGATG AATTGGTAAC ATCAGCACAC TTTACTGAAG CTTCCATTCT GACAGAGCAG 1080

GATATTATTC AGCAAGAACG AGAAATGTAC CAAGATGATC CAGATTCGTG TTTATTCTTT 1140

TCAACTTTAG CGAATTTGTA TCCTGGTACA CCTTTAGCAA CTGATATAGT TGGAAGTGAG 1200

GAGTCCATTT CCCAAATCAA TCTAACTAAT TTGCAAGAAA ATTTTACAAA GTTTTACAAA 1260

CCTGTAAACA TGTCTCTGTT TTTAGTTGGT AATTTTGATG TGGAGCGAGT ACAGGACTAT 1320

TTTGAAAGCA AAGAACTGAA AGATTCAGAT TTTCAGGAAG TAGCAAGAGA AAAGTTGTTT 1380

TTACAGCCTG TAAAGCCAAC AGATAGTATG AGAATGGAAG TATCTTCTCC CAAACTAGCG 1440

ATTGGAGTTA GAGGTAAGCG AGAAGTTTCT GAAGCGGATT GCTATCGACA TCATATTTTA 1500

TTAAAATTAT TGTTTGCAAT GATGTTTGGT TGGACTTCGG GATCGTTTTC AAAAATGTTA 1560

TGAATCAGGT AAAATTGATG CGTCCTTATC TCTGGAAGTT AAATAACAAG TCGCTTTCAT 1620

TTTGTCATGT TGACAATAGA TACGAAAGAG CCAGTTGCTT TGTCTCATCA ATTTAGGAAG 1680

GCTATTCGTA ATTTTACAAA GGATTTAGAT ATTACAGAGG AACATTTAGA TATTATCAAA 1740

AGAGAGATGT TTGGCGAATT TTTCAGTAGC ATGAACTCTC TTGAATTTAT TGCAACGCAA 1800

TATGATGCTT TTGAAAATGG TGAGACAATT TTTGATTTGC CGAAAATTTT ACAGGAAATT 1860

ACTTTAGAGG ATGTCCTTGA TGCTGGACAT CATTTAATAG ATGATGGTGA CATAGTTGAT 1920

TTTACAATAT TCCCATCGTA GTAACCTATC ATAATAGACA CTAGAAAGAA GGGATGACAA 1980

GTATGAGAAA AAAAACAATT GGAGAGGTTT TACGATTAGC TAGAATCAAT CAGGGATTGA 2040

GTTTAGATGA ATTGCAGAAA AAGACAGAAA TCCAGTTAGA TATGTTGGAA GCAATGGAAG 2100

CAGACGATTT CGATCAACTT CCAAGTCCTT TTTACACGCG TTCTTTCTTG AAAAAATATG 2160

CATGGGCTGT TGAGTTAGAT GACCAAATTG TTTTGGATGC TTATGATTCT GGGAGTATGA 2220

TTACTTATGA GGAAGTAGAT GTTGATGAAG ATGAGTTGAC AGGTCGTAGA CGTTCAAGTA 2280

AGAAAAAGAA GAAAAAAACA TCATTTTTAC CTTTATTTTA TTTTATCCTT TTTGCTTTAT 2340

CGATTTTAAT TTTTGTGACT TATTATGTTT GGAACTATAT TCAAACTCAA CCAGAGGAGC 2400

CTTCTCTTTC TAATTACAGT GTGGTTCAAT CAACAAGTTC AACTAGCTCT GTTCCCCACT 2460

CCTCAAGTAG TAGTTCTTCT AGTATAGAAT CAGCTATAAG TGTATCAGGC GAAGGAAATC 2520

ATGTAGAAAT CGCTTATAAG ACAAGTAAGG AAACAGTTAA ATTGCAATTG GCAGTTTCAG 2580

ATGTTACAAG TTGGGTCAGT GTTTCAGAAA GCGAACTTGA GGGCGGTGTA ACCTTATCGC 2640

CAAAGAAGAA AAGTGCAGAA GCAACAGTTG CAACTAAAAG TCCTGTAACA ATTACGTTAG 2700

GTGTTGTAAA AGGTGTTGAT TTGACAGTAG ATAATCAGAC TGTTGATTTA TCGAAATTAA 2760

CAGCTCAGAC TGGACAAATC ACTGTAACCT TTACTAAAAA TTAAGGAAAA ACGAATGAAA 2820

AAAGAACAAA TTCCCAATCT CTTAACAATA GGTCGAATTC TCTTTATACC TATTTTTATC 2880

TTTATTTTAA CGATAGGAAA TTCGATAGAG AGTCATATAG TTGCAGCTAT TATCTTTGCT 2940

GTTGCCAGTA TTACCGACTA TTTAGATGGA TATTTAGCTC GTAAATGGAA TGTGGTCAGT 3000

AATTTTGGTA AATTTGCAGA TCCTATGGCG GATAAGTTAC TAGTTATGTC GGCTTTTATT 3060

ATGTTGATTG AGTTAGGTAT GGCTCCGGCT TGGATTGTTG CAGTGATTAT CTGTCGTGAG 3120

TTAGCTGTGA CAGGTTTAAG GCTTTTATTG GTTGAAACTG GTGGAACAAT TTTAGCAGCA 3180

GCAATGCCTG GAAAAATTAA AACTTTTAGT CAGATGTTTG CTATTATTTT CTTGCTATTA 3240

CATTGGACTT TGCTTGGTCA AGTTCTACTT TATGTAGCCT TATTTTTCAC TATCTACTCT 3300

GGCTATGACT ATTTCAAGGG TAGTGCCTAT GTATTTAAAG GGACATTTGG TTCGAAATGA 3360

AATCAATAAT TGATGTAAAA AATCTTTCTT TTCGCTATAA AGAAAATCAG AACTACTACG 3420

ATGTGAAGGA TATTACGTTT CACGTGAAAC GTGGAGAATG GCTTTCGATT GTAGGGCATA 3480

ATGGTAGTGG TAAATCAAC 3499

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 821 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

ATTTTTGAAT AATCAAGCGG AACCAAGAGG TCTTCGTCCT TCATCTTGTT AATCATGTAT 60

TCACTTGGAA TGGCAATATC GTAGGTCGTT CCACCCTGCT TTATCTTAGT GTACATGGCT 120

TCGTTGGAGT CAAAAGCCTC GTACTGAACT TGAATTCCTG TTTCTTCTGT AAACTGAGTC 180

AAGAGTTCAG GATCGATATA GTCTCCCCAG TTATAGATAA CCAATTTTTG ACTATCTCGA 240

CTATTGATTT TACTATCTAA ATGAGTCGCA ATTCCCCACA AGACAAGGAT AATCGCTGCA 300

ATTCCTGCTA AAATGAATAG ATTTTTTTCA TGCTTGCTCC TCCTTCTCAC GAGAGATAAA 360

GTAATAACCT ACAACTAGGA TAATACTAAA GAGAAAGACT AGAGCAGAAA GGGCATTGAT 420

TTCTAGCGAA ATCCCCTTGC GAGCACGAGA GTAAATCTCG ACTGATAGGG TTGAAAAGCC 480

ATTTCCTGTT ACAAAGAAGG TCACGGCAAA GTCATCTAAC GAATAGGTGA AGGCCATGAA 540

ATAACCAGCA ATGATAGACG GAGTCAGGTA AGGAAGCATG ATTTCCTTAA ACATCTGAAA 600

TTGACTAGCT CCCAAGTCAT AGGCCGCATG AATCATGTCG CCATTCATTT CCTTGAGTCG 660

GAGGCAAGAC CATCAAGACC ACGATAGGAA TGGAGAAGGC CACATGACTA GATAGAACGG 720

TCAAAAAGCC AAGTGAAAAC TTGAGTTGGG TAAAGAGAAT CAAGAAGTAG CACCAATCAT 780

AACGTCAGGC GCAACCATGA GGATATTATT GAGTGATAGA A 821
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1309 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGCCGGTGCC ACAGTCCAAG CTATCGGTAT CGTGATTGAG AAATCCTTCC AAGATGGTCG 60

TGATTTGCTT GAAAAAGCAG GCTACCCTGT CCTATCACTT GCTCGCTTGG ATCGTTTTGA 120

AAATGGTCAG GTCGTATTTA AGGAGGCAGA TCTCTAATGC AAACTCAAGA AAAACACTCG 180

CAAGCAGCCG TTCTCGGCTT GCAGCACTTA CTAGCCATGT ACTCAGGATC TATCCTGGTT 240

CCCATCATGA TTGCGACAGC CCTTGGCTAT TCAGCTGAGC AGTTGACCTA CCTGATTTCT 300

ACAGATATCT TCATGTGTGG GGTGGCAACC TTCCTCCAAC TCCAACTCAA CAAATACTTT 360

GGGATTGGAC TCCCAGTCGT TCTTGGAGTT GCATTCCAGT CGGTCGCTCC CTTGATTATG 420

ATTGGGCAAA GCCATGGTAG TGGCGCTATG TTTGGTGCCC TTATCGCATC TGGGATTTAC 480

GTGGTTCTTG TTTCAGGCAT CTTCTCAAAA GTAGCCAATC TCTTCCCATC TATCGTAACA 540

GGATCTGTTA TTACCACGAT TGGTTTAACC TTGATCCCTG TCGCTATTGG AAATATGGGA 600

AATAACGTTC CAGAGCCAAC TGGTCAAAGT CTCTTGCTTG CAGCTATTAC TGTTCTGATT 660

ATCCTCTTGA TCAACATCTT TACCAAAGGA TTTATCAAGT CTATCTCTAT TTTGATTGGT 720

CTGGTTGTTG GAACTGCCAT TGCTGCTACT ATGGGCTTGG TGGACTTCTC TCCTGTTGCG 780

GTAGTCCACT TGTCCATGTC CCAACTCCAC TCTACTTTGG GATGCCAACC TTTGAAATCT 840

CATCTATTGT CATGATGTGT ATCATCGCAA CGGTGTCTAT GGTTGAGTCA ACTGGTGTTT 900

ATCTAGCCTT GTCTGATATC ACAAAAGATC CAATCGACAG CACGCGCCTT CGCAACGGTT 960

ACCGCGCAGA AGGTTTGGCG GTACTTCTCG GAGGAATCTT TAACACCTTC CCTTACACCG 1020

GATTTTCACA AAACGTTGGT TTGGTTAAAT TGTCAGGCAT CAAAAAACGC CTGCCAATCT 1080

ACTACGCAGG TGGTTTCCTG GTTCTCCTTG GACTGCTTCC TAAGTTTGGT GCCCTTGCCC 1140

AAATCATTCC AAGCTCCGTC CTCGGCGGTG CCATGCTGGT GATGTTTGGT TTTGTATCTA 1200

TTCAAGGGAT GCAAATCCTC GCCCGAGTTG ACTTTGTAAC AATGAACACA ACTTCCTTAT 1260

CGCAGTGTTT CAATCGCTGC AGGTGTCGGT CTCAACAACA AGTAATCTC 1309

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1031 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
TTAAAGTTCC AGTTTATCTA GGTTCTTCAT TTGCCTTTAT CACAGCTATG TCACTGGCTA 60

TGAAAGAAAT GGGGGGTGAT GTATCTGCTG CCCAAACAGG GGTTATCTTG ACTGGTTTGG 120

TCTATGTCCT TGTTGCTACC AGCATCCGAT TTGTAGGAAC AAAATGGATT GATAAACTCT 180

TGCCACCAAT CATTATCGGT CCTATGATCA TCGTTATCGG TCTTGGACTT GCAGGTTCAG 240

CTGTTACCAA TGCAGGTCTT GTAGCAGACG GAAATTGGAA AAATGCTCTG GTAGCCGTTG 300

TTACTTTCCT AATTGCTGCC TTTATCAATA CAAAAGGAAA AGGCTTCCTA CGAATCATTC 360

CATTCCTCTT TGCCATTATC GGTGGTTACC TTTTCGCACT AACTCTTGGC TTGGTTGACT 420

TTACACCAGT TCTTAAAGCC AACTGGTTCG AAATTCCTGG TTTCTACTTG CCATTTAGCA 480

CAGGTGGTGC CTTTAAAGAG TACAATCTTT ACTTTGGTCC AGAAGCCATC GCTATCTTGC 540

CAATCGCTAT CGTAACAATT TCTGAACATA TCGGAGACCA TACTGTTTTG GGTCAAATCT 600

GTGGCCGTCA ATTCTTAAAA GAACCAGGTC TTCATCGTAC TCTTCTTGGT GACGGTATCG 660

CAACTTCTGT TTCTGCCTTC CTTGGTGGAC CAGCCAATAC AACTTACGGA GAAAATACAG 720

GGGTTATCGG TATGACTCGT ATCGCTTCTG TCTCAGTTAT CCGTAACGCT GCCTTCATCG 780

CGATTGCCCT CAGCTTCCTT GGTAAATTCA CTGCCTTGAT TTCAACTATT CCAAACGCTG 840

TACTTGGTGG TATGTCAATC CTTCTCTATG GGGTTATCGC CAGCAATGGT TTGAAAGTCT 900

TGATTAAAGA ACGTGTTGAT TTCGCTCAAA TGCGAAACCT CATCATCGCA AGTGCTATGT 960

TGGTTCTTGG ACTTGGAGGA GCTATCCTTA AACTTGGTCC AGTACACTTT CAGGTACTGC 1020

CCTTTCAGCC A 1031
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 568 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
ACAGTTTAAT CATTGCCTTG GCTACAACCC TCATTGCGAT TATTATTTCT GCTATGGCAG 60 
CCTATGGTAT TGTTCGATTC TTTCCTAAAT TGGGAGCAAT CATGTCGAGA CTACTCGTCA 120

TTACCTACAT TTTCCCACCA ATTTTGTTAG CAATTCCCTA TTCAATTGCC ATTGCTAAAG 180

TTGGGTTAAC AAATAGTTTA TTTGGCTTGA TGATGGTTTA TCTATCTTTT AGTGTTCCAT 240

ATGCAGTTTG GCTCTTAGTT GGATTTTTCC AAACAGTTCC AATTGGAATT GAAGAAGCGG 300

CTAGAATTGA TGGTGCAAAT AAATTTGTTA CGTTTTATAA AGTTGTGCTA CCGATTGTAG 360

CACCAGGTAT TGTAGCAACA GCTATTTATA CATTTATCAA TGCTTGGAAT GAATTTCTGT 420

ATGCCTTGAT TTTGATTAAC AATACAGGAA AGATGACAGT AGCAGTAGCC CTTCGTTCAC 480

TTAATGGTTC AGAAATACTA GACTGGGGAG ATATGATGGC AGCGTCTGTT ATTGTAGTTC 540

TTCCATCAAT TATTTCTTCT CTATCACC 568
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 468 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
ACCAAAGACT TAGCTTCTTC AAAAAGCGGA TCACCACCAG CATCTCCATC CGAAAATTCT 60 
CCTTCATTTT CAGAAACCTC ACCTGGATCA AAACTCTCAT CGTAGTCTGC ATCTGCCTGA 120
GTCTTGATGA AGTTCACAAT GCGCTCAACA TCGTCATCCG AGATAAAGGA GCCTTGGAGA 180 
CGAACTGGAT GATTTTCATC AATCGGTTTA AAGAGCATGT CTCCTCGACC AAGAAGTTTT 240 
TCTGCTCCAT TTTCATCCAA AATCGTACGG GAGTCTGTTC CTGATGAAAC CGCAAATGCT 300 
ACACGAGATG GAACATTGGC CTTAATCAAA CCAGAGATGA CATCAACAGA TGGACGCTGA 360
GTTGCAAGAA TCATGTGGAT ACCTGCAGCA CGCGCCTTCT GCCCAAGACG GATGATAGCA 420 
TCTTCCACTT CCTTGCTGGC CACCATCATG AGGTCAGCCA ACTCATCC 468

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 466 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
AAGCTGACAA TCTTTTCTGC AGTTGGAGCA TCCCAGAAGG ATACACCACT AAGGATGCGA 60

CCTGCCTTGC TATCAACAAT AATGTCTTGA ACCTTGTAGT CATCTCCATA GACCAAGAAC 120

CATTCGTTGG TACAATCTTC ACGATAAACA CTAAAATAAG TCGAACGAGT CAAATCATTG 180

CGGAACATAT TTTTAA^GAG ATAGTTATCT GCATCAATAA CATAGCTGTT GGCCAATTCT 240

TCTTTTACAA GATAGAGAGA GTAAAAGTTA TTGTAGTCAG CGTATTTATC ATTGAAAACG 300

AGACGAACAC CGTATTTCTC TTTCAAGTAA TCGAATTGTT CTTTAAGATA ACCAACAATG 360

ATGATGATGT CATTGATTCC TTTTTCTTTG AGAAACTCAA TTTGGTACTC AATCAAAGGT 420

TTTTGATTAA CCTGAACCAA GGCTTTAGGG GTATTTTCAG TCATAG 466

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1040 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CACATAATCT GTATATTGAC TATAAGTTTT AAAAAACAAT TTTTAAGCTC TTCCTTGTCT 60

TCTCTAACCA AGCGTGTTAT AATGAATACT GCTCAAGCGA CCTTCAATCG TGAAGCACAC 120

ACGACCTTCA ATCGTGAATA AACGAATAGA TGGGAGACTT ACCATGAGTG ATAACTCTAA 180

AACACGTGTT GTCGTGGGGA TGAGTGGTGG TGTTGATTCG TCGGTGACGG CTCTTTTGCT 240

CAAGGAGCAG GGCTACGATG TGATCGGTAT CTTCATGAAG AACTGGGATG ACACAGATGA 300

AAACGGCGTC TGTACGGCGA CCGAAGATTA CAAGGATGTG GTTGCGGTGG CAGATCAGAT 360

TGGCATTCCC TACTACTCTG TCAATTTTGA AAAAGAGTAC TGGGACCGCG TTTTTGAGTA 420

TTTCCTAGCT GAATACCGTG CAGGGCGCAC GCCAAATCCG GACGTTATGT GCAACAAGGA 480

AATCAAGTTC AAGGCCTTTT TGGACTATGC CATGACCTTG GGGGCAGACT ATGTAGCGAC 540

TGGGCACTAT GCTCGAGTGG CGCGTGATGA GGATGGCACT GTTCACATGC TTCGTGGCGT 600

GGACAATGGC AAGGATCAGA CCTATTTCCT CAGCCAACTT TCGCAAGAAC AACTTCAAAA 660

AACCATGTTC CCACTAGGAC ATTTGAAAAA GCCTGAAGTT CGAAAACTAG CAGAAGAAGC 720

AGGTCTTTCG ACTGCTAAGA AGAAAGACTC GACAGGGATT TGCTTTATCG GAGAAAAGAA 780

CTTTAAAAAC TTTCTCAGCA ACTACCTGCC AGCTCAGCCT GGTCGTATGA TGACTGTGGA 840

TGGTCGTGAT ATGGGCGAGC ATGCTGGTCT TATGTACTAT ACAATCGGTC AGCGTGGCGG 900

ACTCGGTATC GGTGGGCAAC ACGGTGGTGA CAATGCCCCT TGGTTCGTTG TCGGAAAAGA 960

TCTAAGCAAG AATATTCTCT ATGTAGGCCA AGGTTTCTAC CATGATTCGC TCATGTCAAC 1020

CACTAGAGGC TAGCCAAGTC 1040
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3071 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

CCGGGGATCT GATAGCCAAT AGAAAACCGC AGAGTCAAAG GGTTTTGTAT GAATTGCGAG 60

ATCGTTTGAA GAGAAATCAG TTTATACTCA ATGATACCAA TCCGGATATT GTCATTTCCA 120

TTGGCGGGGA TGGTATGCTC TTGTCGGCCT TTCATAAGTA CGAAAATCAG CTTGACAAGG 180

TCCGCTTTAT CGGTCTTCAT ACTGGACATT TGGGCTTCTA TACAGATTAT CGTGATTTTG 240

AGTTGGACAA GCTAGTGACT AATTTGCAAC TAGATACTGG GGCAAGGGTT TCTTACCCTG 300

TTCTGAATGT GAAGGTCTTT CTTGAAAATG GTGAAGTTAA GATTTTCAGA GCACTCAACG 360

AAGCCAGCAT CCGCAGTCTG ATCGAACCAT GGTGGCAGAT ATTGTAATAA ATGGTGTTCC 420

CTTTGAACGT TTTCGTGGAG ACGGGCTAAC AGTTTCGACA CCGACTGGTA GTACTGCCTA 480

TAACAAGTCT CTTGGCGGTG CTGTTTTACA CCCTACCATT GAAGCTTTGC AATTAACGGA 540

GATTGCCAGC CTTAATAATC GTGTCTATCG AACATTGGGC TCTTCCATTA TTGTGCCTAA 600

GAAGGATAAG ATTGAACTTA TTCCAACAAG AAACGATTAT CATACTATTT CGGTTGACAA 660

TAGCGTTTAT TCTTTCCGTA ATATTGAGCG TATTGAGTAT CAAATCGACC ATCATAAGAT 720

TCACTTTGTC GCGACTCCTA GCCATACCAG TTTCTGGAAC CGTGTTAAGG ATGCCTTTAT 780

CGGTGAGGTG GATGAATGAG GTTTGAATTT ATCGCAGATG AACATGTCAA GGTTAAGACC 840

TTTTTAAAAA AGCACGAGGT TTCTAAGGGA TTGCTGGCCA AGATTAAGTT TCGAGGTGGA 900

GCTATTCTGG TCAATAATCA ACCGCAAAAT GCAACGTATC TATTGGACGT TGGAGACTAC 960

GTTACCATTG ACATTCCCGC TGAGAAAGGC TTTGAAACCT TGGAGGCTAT TGAGCTTCCA 1020

TTAGATATTC TCTATGAGGA TGACCACTTT CTAGTCTTGA ATAAACCCTA TGGAGTGGCT 1080

TCTATTCCTA GTGTTAATCA CTCTAATACC ATTGCCAATT TTATCAAGGG TTACTATGTC 1140

AAGCAAAATT ATGAAAATCA GCAGGTTCAC ATTGTTACCA GACTAGATAG GGACACTTCT 1200

GGCTTGATGC TCTTTGCCAA GCACGGTTAT GCCCATGCAC GATTAGACAA GCAGTTGCAG 1260

AAGAAATCTA TCGAGAAACG CTACTTTGCT TTGGTTAAGG GAGATGGACA TTTGGAGCCA 1320

GAAGGGGAAA TTATTGCTCC GATTGCGCGT GATGAAGATT CCATTATTAC CAGACGAGTG 1380

GCTAAAGGCG GAAAGTATGC CCATACTTCA TACAAGATTG TAGCTTCTTA TGGAAATATT 1440

CACTTGGTCT ATATTCACCT GCACACTGGT CGAACCCATC AAATCCGAGT CCATTTTTCT 1500

CATATCGGTT TTCCTTTGCT GGGAGATGAT TTGTATGGTG GTAGTCTGGA AGATGGTATT 1560

CAACGTCAGG CTCTGCATTG CCATTACCTA TCCTTTTATC ATCCATTTTT AGAGCAAGAC 1620

TTGCAGTTAG AAAGTCCCTT GCCGGATGAT TTCAGTAACC TTATTACCCA GTTATCAACT 1680

AATACTCTAT AAAAACTGTC TCAGAGTATA ATTATTATCT TAAAGGAGAA AACTCATGGA 1740

AGTTTTTGAA AGTCTCAAAG CCAACCTTGT TGGTAAAAAT GCTCGTATCG TTCTCCCTGA 1800

AGGGGAAGAG CCTCGTATTC TTCAAGCAAC AAAACGCTTA GTAAAAGAAA CAGAAGTGAT 1860

TCCTGTTTTG CTTGGAAATC CTGAAAAAAT TAAAATTTAT CTTGAAATTG AAGGAATCAT 1920

GGATGGTTAT GAGGTCATCG ACCCTCAACA TTATCCTCAA TTTGAAGAAA TGGTTTCTGC 1980

CTTGGTGGAG CGTCGCAAGG GCAAAATGAC TGAAGAAGAT GTACGCAAGG TTTTGGTTGA 2040

AGATGTCAAC TACTTTGGTG TGATGTTGGT TTACTTGGGC TTGGTTGATG GAATGGTGTC 2100

AGGAGCGATT CACTCAACAG CTTCAACAGT TCGCCCAGCT CTACAAATCA TCAAAACTCG 2160

TCCAAATGTA ACTCGTACTT CAGGAGCCTT CCTCATGGTT CGTGGTACGG AACGTTACCT 2220

ATTTGGAGAC TGTGCCATTA ATATCAATCC AGATGCAGAA GCCTTGGCTG AAATTGCCAT 2280

CAACTCAGCA ATCACAGCTA AGATGTTTGG CATCGAACCT AAAATTGCCA TGTTGAGCTA 2340

TTCTACTAAA GGTTCAGGGT TTGGTGAAAG CGTTGATAAG GTCGTTGAAG CAACTAAAAT 2400

TGCTCACGAC TTGCGTCCTG ACCTTGAAAT CGATGGTGAG TTGCAATTTG ATGCGGCCTT 2460

TGTTCCCGAA ACTGCAGCTC TGAAAGCTCC GGGAAGTACA GTAGCTGGTC AAGCAAATGT 2520

CTTCATCTTC CCAGGTATCG AGGCAGGAAA TATCGGTTAC AAGATGGCTG AACGCCTGGG 2580

TGGCTTTGCG GCTGTAGGAC CTGTTTTGCA AGGTTTAAAC AAGCCAGTTA ATGATCTTTC 2640

TCGTGGATGT AATGCAGATG ATGTTTACAA GTTGACCCTC ATCACAGCAG CTCAAGCAGT 2700

TCATCAATAG TGAAAACTAT AAAGTGATAT ACTATGCTAT ACTGTAGTTA TGAAACTATG 2760

TACGAAAAGC ACTGCCATTA ATTCCTGAGA ACTAAATTAC TGATTGGTGT CAAAAAGGAA 2820

AACTTCCAAG CGATGATATC CTGTCTATAC ACGACCTATA GAAATCTGTA ATATACATGT 2880

CCGTAAAACG ATAAATTCCC TTTTTGATTT TAAATGAGTA TGAAAAGAGA ATTTTCCGGC 2940

TCTTTGTCAA CTGTAGTGGG TTGAAAAAAA GCTAAGCTCG AGAAAGGACA AATTTTGTCC 3000

TTTCTTTTTT GATATTCAGA GCGATAAAAA TCCGTTTTTT GAAGTTTTCA AAGTTTCGAC 3060

TCTAGAGGAT C 3071

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 720 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

TTTCCATGGT ATGGTAAAGG TTTTTCTTTT TTTTAAAAGG AAAACGAGAA GAGGAGGTTC 60

TTATGAAAGC AAGCATTGCC TTGCAAGTTT TACCCCTAGC ACAGGGGATT GATCGGATAG 120

CTGTTATTGA TCAGGTCATT GCTTATCTGC AAACTCAAGA AGTGACGATG GTAGTGACAC 180

CATTTGAAAC GGTCTTGGAA GGGGAGTTTG ATGAGCTTAT GCGCATTCTA AAAGAAGCGC 240

TGGAAGTGGC AGGGCAGGAG GCAGACAATG TCTTTGCCAA TGTCAAAATA AATGTAGGAG 300

AGATTTTAAG TATTGATGAG AAACTTGAAA AGTATACTGA GACGACACAT TAGTCTATTG 360

GGCTTTCTCG GAGTATTGTC AATCTGGCAG TTAGCAGGTT TTCTTAAACT TCTCCCCAAG 420

TTTATCCTGC CGACACCTCT TGAAATTCTC CAGCCCTTTG TTCGTGACAG AGAATTTCTC 480

TGGCACCATA GCTGGGCGAC CTTGAGAGTG GCTTTACTGG GGCTGATTTT GGGAGTTTTG 540

ATTGCCTGTC TTATGGCTGT GCTCATGGAT AGTTTGACTT GGCTCAATGA CCTGATTTAC 600

CCTATGATGG TGGTCATTCA GACCATTCCG ACCATTGCCA TAGCTCCTAT CCTGGTCTTG 660

TGGCTGGGTT ATGGGATTTT TGCCCAAGAT TGTCTTGATT ATCTTAACAA CAACCTTTCC 720



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 852 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GCCGTCATAA TCATGCGCCG AATCCGTCCC CATTAAAATC TGGGTCTGTA AAGACAATGA 60 

CTCCATGACG TTGGTGTAGA CGCTGAATCC GCTCTATGTC CTGGTCATTG ATGGCAGAAC 120

CTCGAGTCTC ATAGGTCTCC ACATCGAAAT AACGCTTGAG ATTGACCGTA TCATCACGAC 180

CTTCAACCAC GATAACTTGG GAAATTCTCT CTTTCATTAC TTGCTGTCCA ATCCCAAAAA 240

TGCGTTCTGC ATTTGCAGTC GTTGCTACCG CCAGCTCTTC TGTCGTCATA CCACGCAAGT 300

CAGCGATAAA GTCGACCACA TAGCGAGTAT AGGCTGTTTT ATTTTCACGA CCACGCTTGG 360

GTACAGGTGC TAAGTAAGGC GCATCTGTTT CTACCAACAT CTTGTCCAAA GGTAACTCTT 420

TAGCTGCTTC TTGGAGGTCA GTTGCCTTCT TGAAGGTCAC CACTCCTGAG AAGGAAATGG 480

TCATACCAAG ATCCCGGTAC CGAGCCCACT CAAGCGTCCC TGAAAATGAA TGCATGATAC 540

CACCACGAGG ACCAACGCCC TCACTCTTGA TAATCTCATA GGTATCTTCC AGCGCATCAC 600

GGGTATGGAC AACAAAAGGC AAATCCAAGT CCTTAGATAG CTGAATCTGA CGGCGAAAAA 660

CCTGCTCCTG CACCTCTTGG GCGCTGTCAT CCAATGGTAG TCTAAGCCAA TTTCACCTAA 720

AGCCACAACC TTGGAATGTT TTAACTTATC CAACAAGTAA GCCTCAACTT CCTCTGTATA 780

AGTACCAGCT TCTGTAGGAT GCCAACCAAT AGTCGCATAG AGCTGCTCAT ACTCATCTAC 840

CAAACTCCAA GG 852
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 868 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
GGGGATCCTC TAGAGTCGAT ATCTACGGTC TCAACCGTAC AGGACTGTTG AACGATGTAC 60

TGCAAGTTCT TTCAAATACA ACCAAGAATA TTTCAACGGT CAATGCCCAA CCAACCAAGG 120

ATATGAAGTT TGCTAATATC CATGTGTCCT TCGGTATTGC CAACCTCTCT ACACTGACCA 180

CGGTTGTCGA TAAAATTAAG AGTGTGCCAG AAGTTTACTC TGTCAAACGG ACCAACGGCT 240

AGTTGTCCTA GCTCTTACTA GAAAGGCTAT TATGAAAATC ATTATCCAAC GGGTTAAAAA 300

AGCCCAAGTG AGTATAGAAG GCCAGATTCA GGGAAAAATC AATCAGGGAC TTTTATTGCT 360

GGTTGGTGTT GGACCAGAGG ACCAAGAGGA AGATTTGGAC TATGCTGTGA GAAAACTGGT 420

CAATATGCGG ATTTTTTCAG ACGCAGAAGG CAAGATGAAC CTGTCTGTCA AAGATATTGA 480

AGGAGAAATC CTCTCTATTT CTCAGTTTAC CCTCTTTGCG GATACTAAGA AAGGCAATCG 540

TCCAGCCTTT ACAGGGGCAG CTAAACCTGA TATGGCATCA GACTTCTATG ATGCTTTCAA 600

TCAAAAATTA GCGCAAGAAG TGCCCGTTCA GACAGGTATC TTTGGAGCAG ATATGCAGGT 660

TGAGCTGGTT AATAACGGAC CTGTTACCAT TATCCTAGAT ACTAAAAAGA GATAAGAAAG 720

ACCAAGCCCA GTCGGCTTGG TCTTTCTCAT CGATCATAAA AATACTCCAA AAAGAAATCG 780

GTTCTTGATA TGCTTGGGGG ACTCTTTTCA GGCTTTGGCA GATGCGATAG GAAGGGATGA 840

GATGTCCTAG GGTGAGGAGA GTTCCCTG 868 
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1399 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
CGGTCCTCGT CCGATTGACT CACACCTTAA GGCGTTTGAA GCTATGGGTG CCACTGCTAG 60 

CTACGAGGGA GATAACATGA AGTTATCTGC TAAAGATACA GGACTTCATG GTGCAAGTAT 120 

TTACATGGAT ACGGTTAGTG TGGGAGCAAC GATTAATACG ATGATTGCTG CGGTTAAAGC 180 

AAATGGTCGT ACTATTATTG AAAATGCAGC CCGTGAACCT GAGATTATTG ATGTAGCTAC 240

TCTCTTGAAT AATATGGGTG CCCATATCCG TGGGGCAGGA ACTAATATCA TCATTATTGA 300

TGGTGTTGAA AGATTACATG GGACACGTCA TCAGGTGATT CCAGACCGCA TTGAAGCTGG 360

AACATATATA TCTTTAGCTG CTGCAGTTGG TAAAGGAATT CGTATAAATA ATGTTCTTTA 420

CGAACACCTG GAAGGGTTTG TTGCTAAGTT GGAAGAAATG GGAGTGAGAA TGACTGTATC 480

TGAAGACAGC ATTTTTGTCG AGGAACAGTC TAATTTGAAA GCAATCAATA TTAAGACAGC 540

TCCTTACCCA GGCTTTGCAA CTGATTTGCA ACAACCGCTT ACCCCTCTTT TACTAAGAGC 600

GAATGGTCGT GGTACAATTG TCGAGTCGAT ACGATTTACG AAAAACGTGT AAATCATGTT 660

TTTGAACTAG CAAAGATGGA TGCGGATATT TCGACAACAA ATGGTCATAT TTTGTACACG 720

GGTGGACGTG ATTTACGTGG GGCCAGTGTT AAAGCGACCG ACTTAAGAGC TGGGGCTGCA 780

CTAGTCATTG CTGGGCTTAT GGCTGAAGGC AAAACTGAAA TTACCAATAT CGAGTTTATC 840

TTACGTGGTT ATTCTGATAT TATCGAAAAA TTACGTAATT TAGGAGCGGA TATTAGACTT 900

GTTGAGGATT AAACCGTAGA GGTGTTTATG AATATTTGGA CCAAATTAGC AATGTTTTCT 960

TTTTTTGAAA CGGATCGCTT GTATTTGCGT CCTTTCTTTT TTAGTGATAG TCAGGACTTC 1020

CGCGAGATAG CTTCAAATCC AGAAAATCTT CAATTTATTT TCCCAACGCA GGCAAGTCTG 1080

GAAGAAAGTC AATATGCACT GGCCAATTAC TTTATGAAGT CCCCTTTGGG AGTGTGGGCA 1140

ATTTGTGACC AGAAAAATCA ACAAATGATT GGTTCTATTA AATTTGAGAA GTTAGATGAA 1200

ATCAAAAAAG AAGCTGAGCT TGGCTATTTT TTGAGAAAAG ATGCTTGGTC GCAAGGATTT 1260

ATGACAGAGG TTGTTAGAAA AATTTGTCAG CTTTCTTTTG AGGAATTTGG CTTAAAACAA 1320

TTATCTATCA TTACCCACCT TGAAAATGAA GCTAGCCAAA GAGTTGCTCT TAAGTCTGGA 1380

TTTAGTTTGT TCCGTCAGT 1399

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1779 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

AGATTGCTCT TGAACACGAT GAAATACCAA TTGGTTGTGT GATTGTCAAA GATGGGAAAA 60

TCATTGGTCG TGGGCATAAT GCGCGTGAGG AATTACAGCG ACGGTTATGC ATGCGGAAAT 120

TATGGCTATA GAGGATGCGA ACTTGAGTGC AGGAGACTGG CGCTTGCTGG ATTGCACACT 180 

TTTTGTGACC ATTGAACCAT GTGTCATGTG TAGTGGGGCG ATTGGGCTTG CCCGTATTCC 240

AAATGTGGTC TATGGGGCTA AAAACCAGAA ATTTGGCGCT GCTGGAAGTT TGTACGATAT 300

CTTGACAGAT GAGCGTCTTA ACCATCGTGT AGAGGTTGAA ACGGGAATTT TGGAAGATGA 360

ATGTGCAGCT ATCATGCAGG ACTTTTTTAG AAATAGACGG AAAAAATAAT TTTGCTTTTA 420

AAATGAATAG GAATGTGATA TAATAAATAG TGGAGCAACA GTTCTGCGTG AAGCGGGTCA 480

GGGGAGGAAT CCCAGCAGCC CTAAGCGATT TGAATTGTGT GCTCTTTTTT TCGTGCTTTT 540

TTCCGAATAA ATAAGATAGA ATAATCTAGA ATAAATGATA ATAGAAAAGA GAAAATTATG 600

AAAATTCGTG GTTTTGAATT GGTTTCGAGT TTTACAGATG AAAATTTATT GCCCAAGCGT 660

GAGACAGCGC ATGCGGCTGG TTACGACTTA AAGGTTGCTG TGCGTACAGT TGTTGCGCCA 720

GGAGAGATTG TCTTGGTTCC GACAGGGGTT AAGGCTTATA TGCAGCCGAC TGAGGTTCTC 780

TACCTCTATG ATCGTTCTTC AAATCCTCGT AAGAAGGGCT TGGTTTTAAT TAACTCAGTT 840

GGGGTCATTG ATGGGGATTA TTATGGAAAT CCTGGAAATG AAGGGCATAT TTTTGCGCAG 900

ATGAAGAATA TCACAGACCA AGAGGTTGTT CTTGAAGTTG GGGAGCGTAT TGTCCAGGCT 960

GTTTTTGCTA CTTTCTTAAT TGCAGATGGA GATGCAGCTG ATGGCGTTCG AACTGGTGGA 1020

TTTGGATCGA CAGGGCACTA GAATGAAGAT TATCTTTGTA CGTCATGGGG AGCCAGATTA 1080

CCGTGAGTTA GAGGAGCGTT CTTATATAGG ATTTGGGATA GATTTGGCAC CCTTGTCTGA 1140

GATGGGACGG CAGCAAGTCC AGAAATTGAG CAAAAATCCT TTACTCTCGT CAGCTGAAAT 1200

AATCGTATCT TCTGCAGTCA CAAGAGCTTT AGAAACGGCT TCGTATGTGG TCTGTGCTAC 1260

GGGTCTTCCT TTAAGAGTAG AGCCTTTATT ACATGAATGG CAGGTCTATA AAACAGGAAT 1320

AGAAAACTTT GAAACAGCTA GAAGACTGTT TTTAGAAAAC AAGGGGGAGT TGCTTCCTAA 1380

TAGTCCTATT CAATATGAGA CAGCTACGGA AATGAAGTCT CGGTTTCTAG AATGTATGTC 1440

TAAGTATCGA GAACATCAGA CTGTGGTAGT TGTTGCTCAT CGACTCTAGA GGAGCCAGTT 1500

TGTGCCAAAT GAGAAGATTG ATTTTTGCCA AGTGATTGAG TGTGAGTTAG AGATATAGAA 1560

AGAGGTTTGT CATCGCAAAG AAAAAAGCGA CATTTGTATG TCAAAATTGT GGGTATAATT 1620

CCCCTAAATA TCTGGGACGT TGCCCCAACT GTGGGTCTTG GTCTTCTTTT GTGGAAGAGG 1680

TTGAGGTTGC CGAAGTTAAG AATGCGCGTG TGTCCTTGAC AGGTGAGAAA ACCAAGCCCA 1740

TGAAACTAGC TGAGGTGACT TCCATCAATG TCAATCGAC 1779 
(2) INFORMATION FOR SEQ ID NO: 20: 



PCT/US97/22578 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3725 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

GCGGATCCTC TAGAGTCGAA AGATTACGAA GGTAAGAACC CTCTTTATTA CTGGTCACAT 60 

CATGGTACAA CAAGCTGCAA CAGTATCTCT TATGGTTCTA TTCTTAGTAC CACAATTGCG 120

CAATGCTTAC GGTACAGCAG CGATTGGTAT CATCTGTGGA CTTTACTGGG CAGTTAGTTC 180

AAATATGACT GTTGAGGCAA CTCAACGCTT GACTGGTGGT GGCGGATTTG CGATTGGTCA 240

CCAACAGCAA TTTGCAATCT GGTTTGTAGA TAAAGTAGCA GGACGCTTTG GTAAGAAAGA 300

AGAAAGTTTA GACAATCTTA AATTACCTAA GTTCCTCTCA ATCTTCCACG ATACAGTTGT 360

TGCATCTGCT ACCTTGATGC TCGTATTCTT CGGGGCCATT CTTTTAATCT TGGGTCCAGA 420

CATTATGTCT AATAAAGAAG TCATCACTTC AGGAACTCTA TTCAATCCTG CTAAACAAGA 480

TTTCTTTATG TACATTATCC AAACAGCCTT TACCTTCTCA GTTTACTTGT TCGTTTTGAT 540

GCAAGGTGTC CGAATGTTCG TATCTGAGTT AACAAACGCT TTCCAAGGTA TTTCAAACAA 600

ATTGTTGCCA GGTTCATTCC CAGCGGTTGA CGTTGCAGCT TCTTATGGAT TTGGTTCTCC 660

AAATGCTGTC TTGTCAGGAT TTACCTTTGG TTTGATTGGT CAATTGATTA CAATTGTCTT 720

GCTCATCGTC TTTAAAAATC CGATTCTTAT TATTACAGGA TTTGTACCAG TGTTCTTTGA 780

CAATGCAGCC ATTGCGGTCT ACGCTGATAA ACGCGGCGGA TGGAAAGCGG CTGTTATCCT 840

TTCCTTTATA TCAGGTGTCC TTCAAGTTGC TCTAGGAGCT CTTTGTGTGG CCCTTCTCGA 900

TTTGGCATCT TATGGTGGCT ACCATGGAAA TATCGACTTT GAATTCCCAT GGCTTGGATT 960

TGGATATATC TTCAAATACC TTGGTATTGT TGGTTATGTA CTTGTGTGTC TCTTCTTGCT 1020

TGTTATTCCT CAACTTCAAT TTGCCAAAGC AAAAGATAAA GAGAAATATT ACAACGGTGA 1080

AGTTCAAGAA GAAGCTTAGT ATCTAGAAAA GGAGAAATAA AATGGTTAAA GTATTAGCAG 1140

CGTGCGGAAA TGGAATGGGT TCATCAATGG TTATCAAGAT GAAGGTTGAA AATGCTCTCC 1200

GTAAGCTTAA TCAAACAGAT TTTACAGTCA ATTCATGCAG TGTCGGTGAA GCTAAAGGTT 1260

60 TAG CAGT AG G ATAT GACAT C GTAATCGCTT CTCTTCATTT GATT CAAGAA TTGGAAGGGC 1320 
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GAACTAATGG GAAG TTAATT GGACTTGATA 
AACTCAGTCA AGCACTACAG TAAAAGGTTG 
5 GTTTCTGTCC TTCTCCCTCT TTAAATAAAG 
ATTGACAATG ACTCGATCCG ACTAGGTTTA 
GTAGCAGTAG ATCCCTTAAT TGAAAGTGGG 

10 

ATTGAATCGA C T GAAGAGTA TGGGCCTTAC 
CACGCTAGAC CTGAAGCTGG TGTGCAAAGT 
15 CCTGTTGTAT TTTCAGATGG GAAAGAGGTA 
TCAAAAATTC ACACAAGTGT AGCCATTCCA 
TCTATTGCAC GTTTACAGG C TTGCCAGACT 

20 

TCTAAGGATA GCCCTTATCT CGAAGGATTG 
AATGACAAAA AGAATAC CT A ATTTACAAGT 
2 5 G ATTAAAG C A GCTGTTTCTG TTGGTCAGGA 
CTTGCTTCAA GTTGGAAGTG AACTGGCTGA 
TATTGTGGCA GACACAAAAT GTGCTGATGC 

30 

TCGTGGAGCA GACTGGATGA CTTGTATCTG 
TCTAAAGGCT ATCAAGACTG AACGAGGAGA 
35 CGATTGGACT TTTGAACAAG CTCAGCTTTG 
TCACCAATCT CGTGATGCTC TTCTTGCTGG 
GGTTAAAAAA CTCATTGACA TGGGCTTCCG 

40 

TACTCTCAAA CTCTTTGAAG GTGTTGATGT 
AGAGGCTGCG GAT C C AG CAG GAGCAGCGCG 
45 GGGGTAAATC ATGGTACGTC CAATTGGAAT 
TTGGCTAGAA CGTTTAAATT TTGCCAAGGA 
TGACGAACGT GACGAGCGTT TAG CAAGACT 

50 

TGTCAAAGCA ATCTATGAAA CTGGTGTTCG 
TCGCTACCCA TTGGGTTCAA AAGATCCAGT 
55 AAAATGTATC GAATT AG C T C AAGACTTGGG 
TGTTTACTAT GAGGAAAAGT CACCCCAGAC 
AGCCTGTGAC TGGGCTGAAG AAGCTCAGGT 

60 

TTTCATCAAT AG CAT C GAAA AATATTTGGC 



-60- 

ACTTGATGGA T G ATAAAGAA ATCACCGAAA 13 80 

GAGGGGGCTG GACAGAAACT GAGAGTTATC 14 4 0 

GAGGCAGATA TGAATTTAAA ACAAGCTTTA 15 00 

GAGGCTAACA ATT G GAAA GA AGCAGTCAAG 15 60 

GCAATTTTGC CAGAGTATTA CGATGCTATC 1620 

TAT AT C TT G A TGCCAGGTAT GGCTATGCCC 168 0 

GATGCCTTTT CATTGATTAC CTTACAAAAT 17 4 0 

TCTGTTTTGT TGGCACTAGC AGCAACAAGC 18 00 

CAAATTATTG CCCTGTTTGA AT T AGAAGAT 1860 

AAAGAAGATG TCTTGGCTAT GATTGAAGAA 192 0 

GATTTGGAAA GTTAGAAAGA GGAATAAAGA 19 8 0 

TGCATTAGAC CATTCAGACT TGCAAGGAGC 2 04 0 

AGTAGATATT ATCGAAGCTG GAACTGTTTG 2100 

AGTCTTGCGT AGCCTTTTCC CAGATAAGAT 2160 

TGGTGGAACA GTTGCTAAAA ATAATGCGGT 2220 

TTGTGCAACC ATCCCTACTA TGGAAGCAGC 22 8 0 

ACGAGGCGAA AT C CAG AT C G AGCTTTATGG 234 0 

GCTAGATGCA GGTATTTCAC AAGCTATTTA 2 4 00 

TGAAACTTGG GGTGAAAAAG AC CTTAAT AA 24 6 0 

TGTATCTGTA ACAGGTGGTC TAGAT GTAGA 252 0 

CTTTACCTTT ATCGCAGGTC GTGGAATTAC 258 0 

TGCCTTCAAG GAT GAAAT CA AACGAATTTG 264 0 

TTATGAAAAG GCAACCCCAA CACACTTTAC 2700 

GTTAGGCTTT GATTTTGTCG AGATGTCTAT 27 60 

TGACTGGAGT AAG GAAGAAC GCTTGGAAGT 2 82 0 

TATTCCTTCT ATCTGTTTTT CAGGCCATCG 288 0 

TCTAGAGGAA AAATCTCTAG AACTCATGAA 2 94 0 

AGTTCGTACG ATTCAATTAG CTGGTTACGA 300 0 

ACGCCAACGT TTTATCAAAA ATTTGAGAAA 3060 

GGTACTTGCT ATTGAAATTA TGGATGATCC 312 0 

TATAGAAAAA GAGATTGACT CTCCCTTCCT 318 0 
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3725 



-61- 



CTTTGTATAT CCAGATATTG GTAATGTGTC TGCATGGCAT AATGATATCT ATAGTGAGTT 32 4 0 

TTATCTTGGT CAT CAT GC CA TCGCAGCTCT CCATCTCAAG GAT A C T TAT G CAGTGACAGA 330 0 

AAGTTCAAAG GGCCAGTTCC GAGATGTACC TTTCGGGCAA GGTTGTGTCA AATGGGAAGA 336 0 

AGCTTTCGAT ATTTTAAAGG AAAC CAATTA TAATGGACCT TTCCTAATCG AAATGTGGTC 342 0 

TGAAAATTGT GAAACAGTAG AAGAAACACG CGCAGCCGTT CAAGAGGCGC AAGCTTTTCT 34 80 

CTATCCACTC AT T AAGAAAG CAGGTTTGAT GTAAGATGAA TCAAGTAATC AATGCTATGC 3540 

GTAAAC GAGT CTGTGATGCC AATCAATCAT TGCCAAAACA TGGACTTGTC AAATTTAC CT 3 600 

GGGGGAATGT ATCTGAAGTT AATCGCGAAC TCGGTGTCAT TGTTATCAAA CCATCAGGCG 3 660 

TGGATTATGA CGAATTGACA CCTGAAAACA TGGTAGTGAC TGATCTAGAT GGTAAGATCC 372 0 
CCGGT 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2483 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: 
TCTAGAATCA TTTCCCAGCA GTTGGCTCAG GAAGTCGCAA TTATCTGGGT GAGTTTTCAG 60

CGTGTTGGAC TGAAGTGGAG ATTGATGAGA TATACCGCGC CTTTGTCATG GCACATTTCA 120

AGAGTTCGCG TCCAGATGCC CAGACCTTGA TTTTCTATAC CCACTATGAC ACTGTGCCAG 180

CGGATGGGGA TCAGGTCTGG ACAGAGGATC CTTTTACGCT TTCGGTCCGC AATGGCTCAT 240

GTATGGGCGT GGGGTTGATG ACGACAGGGT CATATCACAG CTCGCTTGAG TGCTTGAGAA 300

AATATATGCA GCCCTGATGA TTACCTGTCA ATATCAGCTT TATCATGGAG GGAGCGGAGG 360

AATCGGCTTC AACAGACCTA GATAAGTATT TGGAAAAGCA TGCAGACAAA CTCCGTGGGG 420

CGGATTTGTT GGTCTGGGAA CAAGGGACCA AAAATGCCTT GGAACAGCTG GAAATTTCTG 480

GTGGCAATAA GGGGATTGTG ACCTTTGATG CCAAGGTAAA AAGCGCTGAT GTGGATATCC 540

ACTCGAGTTA TGGTGGTGTT GTGGAATCAG CTCCTTGGTA TCTCCTCCAA GCCTTACAGT 600

CTCTTCGTGC TGCGGATGGC CGTATCTTGG TTGAAGGCTT GTACGAAGAA GTACAAGAGC 660

CCAATGAACG AGAAATGGCC TTGCTAGAAA CTTATGGTCA ACGAAACCCA GAGGAAGTTA 720

GTCGGATTTA TGGATTGGAG TTGCCTCTCT TACAGGAGGA GCGGATGGCC TTTCTAAAAC 780

GTTTCTTTTT CGAGCCAGCG CTTAATATCG AAGGAATCCA GTCTGGTTAT CAAGGTCAGG 840

GTGTTAAGAC TATTTTGCCT GCAGAAGCCA GTGCCAAGCT AGAGGTTCGT CTGGTTCCGG 900

GCCTAGAACC GCATGATGTT CTGGAAAAAA TTCGGAAACA GCTAGACAAA AATGGCTTTG 960

ATAAGGTAGA ATTATACTAT ACCTTGGGAG AGATACTAGA GTCGAAGCGA TATGAGCGCA 1020

CCAGCCATTC TCAATGTGAT CGAGTTGGCC AAGAAATTCT ATCCACAGGG CGTTTCAGTC 1080

TTGCCGACGA CAGCGGGGAC AGGACCTATG CATACGGTCT TTGATGCCCT AGAGGTACCA 1140

ATGGTTGCAT TCGGTCTAGG AAATGCCAAT AGCCGAGACC ACGGTGGAGA TGAAAATGTG 1200

CGAATCGCTG ATTATTACAC CCATATCGAA TTAGTAGAGG AGCTGATTAG AAGCTATGAG 1260

TAGAGATATT ATCAAGTTAG ATCAGATCGA TGTGACTTTT CACCAAAAGA AGAGAACCAT 1320

CACAGCGGTT AAGGATGTGA CCATTCACAT CCAAGAAGGG GATATCTACG GAATCGTTGG 1380

ATATTCTGGA GCAGGGAAAT CAACCCTTGT ACGGGTGATT AACCTCTTGC AAAAACCATC 1440

TGCAGGGAAA ATTACCATTG ACGACGATGT GATTTTTGAC GGCAAGGTGA CCTTGACGGC 1500

AGAGCAGTTG CGTCGTAAAC GTCAAGATAT CGGGATGATT TTCCAGCATT TTAACCTGAT 1560

GAGCCAAAAG ACAGCAGAGG AGAATGTAGC CTTTGCCCTT AAACACTCTG GACTCAGCAA 1620

GGAAGAAAAG AAGGCTAAAG TAGCTAAGTT GTTGGACTTG GTTGGTTTGG CAGATCGTGC 1680

TGAAAACTAC CCTTCACAAC TATCTGGAGG GCAAAAACAG CGTGTGGCAA TTGCGCGTGC 1740

CTTGGCCAAT GATCCAAAAA TCTTGATTTC AGACGAGTCA ACTTCTGCCC TTGACCCTAA 1800

GACAACCAAG CAGATTTTGG CCTTGTTGCA AGATTTGAAC CAAAAATTAG GATTGACAGT 1860

TGTCTTGATT ACGCATGAAA TGAGATTGTC AAAGACATTG CCAACCGTGT GGCGGTTATG 1920

CAGGATGGGC ATTTGATTGA AGAGGGAAGT GTCCTTGAAA TCTTCTCAAA CCCTAAACAA 1980

CCTTTGACTC AAGACTTTAT CTCAACAGCC ACAGGTATTG ACGAAGCCAT GGTCAAAATC 2040

GAGAAGCAAG AAATCGTGGA ACACTTGTCT GAAAACAGTC TCTTGGTGCA ACTTCAAGTA 2100

CGCTGGAGCT TCAACAGACG AGCCACTTTT GAATGAATTG TACAAGCATT ACCAAGTAAT 2160

GGCTAATATT CTCTATGGGA ATATCGAAAT TCTCGATGGT ACTCCTGTTG GAGGAATTGG 2220

TGGTGGTCTT GTCAGGTGAA AAAGCAGCGT TGGCAGGTGC CCAAGAAGCC ATTCGTCAAG 2280

CAGGTGTACA ACTAAAAGTA TTGAAGGGAG TACAGTAAGA TGGAATCATT GATTCAAACC 2340

TATTTACCAA ATGTCTATAA AATGGGTTGG GCTGTCAGGC AGGCTGGGGG ACGGCTATCT 2400

ACTTAACTCT TTATATGCAG TTCTTTCCTT CATTATTCGG GTTCTTGGGG CTAGTGGCAG 2460

GTCTTCTCGT CTTAAGCGCC AGT 2483

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1010 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

CCAATTAATG TGAGTTAGCT CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG 60

GCTCGTATGT TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC 120

CATGATTACG CCAAGCTTGC ATGCCTGCAG GTCGACTCTA GAGGATCCAA GCCATAGTTA 180

GACATGACTG CCAAATCTAA GGTTTGAGCA GTTGTTAAAT AAGCATTAGC TGTCGCCTCC 240

ATGTTGGGAC TGGTTACTTT GAGGCCTACT AAGGCTAGAG ATCCCAACAT CATCAGGATC 300

AAGATGGATA AAAAACGCCC TTGGAGCCTG TGAAGGACTG AATTAAGTCC TTCCGAATAA 360

GTTTTTCGCT TGATCATGCT AGTACTCCAA ACTGTCAATA TCCTGAGGAT GCTGGTTGAG 420

CACCACATCC TTGACACTGG CATCGTGCAT TTGAATCACG CGATCAGCAA TGGGCGCCAA 480

AGCTCCATTA TGAGTCACGA TGATCACCGT CGCTCCCTTT TGACGAGACA TGTCTTGGAG 540

AATTTTCAAA ACCTGCTTGC CCGTCTGATA ATCCAAGGCT CCAGTCGGTT CATCACAAAG 600

GAGAATTTTA GGATTTTTGG CTACCGCGCG TGCAATGGAG ACTCGCTGTT GCTCCCCTCC 660

AGAAAGCTGG GCTGGAAAGT TATTTAGACG ATGAGCCAGA CCTACATCTG TCAAGACCTG 720

ATCAGAATTC AAGGCATCTG TCACAATTTC AGAAGCAGTT CCACATTTTC CTTAGCTGTC 780

AGATTAGAAA CTAGATTATA AAACTGAAAA ACAAACCCCA CATCATTTCT ACGGTAATTG 840

GTGCGCTGGT GGGAACTATA ATCCGCAATA TTAACACCAT CAATCCAGAT TTCCCCTTCA 900

TCATTGGTAT CCATTCCCCC AAGAAGGTTA AGAACTGTTG ACTTGCCTGC ACCTGAAGCA 960

CCAAGGATAA TAACCAGTTC CCCCTTTTCA ATCTCAAAAT TCACATCACG 1010
(2) INFORMATION FOR SEQ ID NO: 23: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1299 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 

TCGATCGCAC CGTCCTCTCC TCGTTCTGCT CTTGCTGGGC TATAGTTCCC TCTTCTAGTC 60

TTGATTTTCT TTGCCCATGC GTTCTTACCA CTTCTACTGT TTGCAGGTTT TACATGTCTG 120

GATATACTAT TTGTGCTAGG CTTAGCTTCT AGGATGGAGA AAAGAAGTCT AGTAGAGTTA 180

TTGAAAGGGG GCATCTTATG ATTGAGTTGA AAAATATTAC CAAAACCATT GGGGGAAAAG 240

TGATTTTGGA TAACTTATCT CTCAGGATTG ATCAGGGGGA TTTGGTAGCT ATTGTTGGTA 300

AGAGTGGTAG TGGGAAGTCG ACCTTGTTAA ATTTATTGGG TTTGATAGAT GGTGATTATA 360

GCGGACGGTA TGAGATTTTT GGTCAGACAA ATCTAGCGGT TAATTCTGCT AAGTCGCAAA 420

CAATAATCCG TGAACATATC TCTTATCTGT TTCAAAATTT TGCCCTGATT GATGATGAAA 480

CGGTCGAGTA CAATCTCATG CTGGCGCTGA AATATGTGAA ATTGCCTAAG AAAGACAAGC 540

TCAAAAAGGT GGAAGAGATT TTAGAGAGAG TAGGTTTGTC AGCTACTTTG CATCAAAGGG 600

TCTCCGAGTT GTCTGGGGGC GAACAACAAC GAATTGCAGT TGCTAGAGCC ATCTTAAAAC 660

CCAGCCAGCT GATTTTAGCC GATGAACCTA CAGGTTCGCT GGATCCTGAA AATAGAGATT 720

TGGTCTTGAA GTTTCTCTTA GAGATGAATC GAGAAGGGAA AACAGTCATT ATTGTGACCC 780

ACGATGCTTA TGTAGCCCAA CAATGTCATC GTGTCATTGA ATTGGGCGAG GGAAAATGAG 840

TTCATTCAGC TCCTTTTGAC TGGCTGAATA CTCATGTTTT CCAGAGAAAA ATAGCATAAA 900

TACGCCTAGG AATGACATTT TATGTAGCAT TTCTAGGTTT TTTTGTTTCA AATTGAAAAT 960

TTTTTCAATT TAGGCTTGAC AAAGGATGAG TATAGGAGTA TTATTTATAC AATAAAAAAG 1020

AATAAACATA AAGAAGGCTT TGTTATGAAT AAGATGAAGA AGGTGTTGAT GACGATGTTT 1080

GGTTTAGTGA TGCTCCCCCT ACTATTTGCT TGTAGTAACA ATCAATCGGC TGGAATTGAA 1140

GCCATCAAGT CCAAAGGAAA ATTGGTTGTA GCCCTCAATC CAGATTTTGC TCCATTTGAA 1200

TATCAAAAAG TGGTTGATGG GAAAAATCAG ATTGTGGGTT CAGATATCGA CTTAGCCAAG 1260

CTATCGCAAC AGAACTAGGT GTCGACTCTA GAGGATCCC 1299

(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

GCCAAACCAA AACAACTCTC AGGTGGTCAA AAACAACGTG TGGCCATCGC TCGTGCCCTC 60 

TCCATGAATC CGGACGCTAT TCTCTTTGAT GAACCAACAT CAGCTCTCGA TCCAGAAATG 120 

GTTGGAGAAG TCCTCAAAAT CATGCAGGAC CTGGCTCAGG AAGGCTTGAC CATGATTGTC 180

GTAACCCATG AAATGGAATT TGCCCGTGAT GTCTCTCACC GTGTTATCTT TATGGATAAG 240

GGCGTGATCC CC 252
(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 455 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

GATTTGGTAC CTCTTCGCAA GGAAGTCGGC ATGGTTTTTC AACATTTTAA CCTTTATCCA 60 

CACAAAACGG TGTTAGAAAA CGTGACACTT GCGCCCATTA AAGTTCTAGG AATTGATAAA 120

AAAGAAGCTG AAAAAACCGC CCAAAAATAT CTGGAATTTG TAAATATGTG AGACAAGAAA 180

GATTCCTATC CCGCCATGCT ATCTGGTGGA CAAAAACAGC GGATCGCCAT CGCTCGTGGT 240

CTTGCTATGC ATCCGGAACT CCTCCTCTTT GATGAACCAA CATCTGCTCT TGATCCTGAG 300

ACTATCGGAG ATGTTCTAGC AGTTATGCAG AAACTGGCGC ATGATGGGAT GAACATGATC 360

ATCGTTACCC ACGAAATGGG CTTTGCTCGA GAGGTTGCGG ACCGCATTAT CTTTATGGCC 420

GACGGAGAAG TTTTAGTAGA TACGACAGAT GTCGA 455
(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 913 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

AAACACTGCT TCTTGAGCGA ATGACGCTTT GTCCTTTTAA TGAGGTTACC AACGGCTTCA 60

AAGAGGATTC CCAGCTCGTT CAGCTGTGGA GGTAGCTCGT CTTCCTCGTG ATGTAAAAGT 120

CGAAATTGAA GTCATCGCAG AGATTGGATA AGCTAGTTGA AGTTTGGTGT TGCCAAACTT 180

CTTTTGATAT AAGGAGAAAA AGATGACAAA GAAACAACTT CACTTGGTGA TTGTGACAGG 240

GATGGGTGGC GCAGGGAAAA CTGTAGCCAT TCAGTCCTTC GAGGATCTAG GTTATTTCAC 300

CATTGATAAT ATGCCGCCAG CTCTCTTGCC TAAGTTTTTG CAGCTGGTTG AAATTAAGGA 360

AGACAATCCT AAGTTGGCCT TGGTAGTGGA TATGCGTAGT CGTTCTTTCT TTTCAGAGAT 420

TCAAGCTGTT TTGGATGAGT TGGAAAATCA AGATGGTTTG GATTTCAAAA TCCTCTTTTT 480

GGATGCGGCT GATAAGGAAT TGGTCGCTCG TTACAAGGAA ACCAGACGGA GTCACCCACT 540

AGCAGCAGAC GGTCGTATTT TAGATGGAAT CAAGTTGGAA CGTGAACTCT TGGCACCTTT 600

GAAAAATATG AGCCAAAATG TGGTGGATAC GACTGAACTC ACTCCACGTG AGCTGCGCAA 660

AACCCTTGCA GAGCAGTTTT CAGACCAAGA ACAAGCTCAG TCTTTCCGTA TCGAAGTCAT 720

GTCTTTCGGA TTTAAGTATG GAATCCCGAT TGATGCGGAC TTGGTCTTTG ATGTCCGTTT 780

CTTGCCAAAT CCCTATTATT TACCAGAACT GAGAAACCAA ACGGGTGTGG ATGAACCTGT 840

TTATGATTAT GTCATGAACC ATCCTGAGTC AGAAGACTTT TATCAACATT TATTGGCCTT 900

GATTGAGCCG ATT 913
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5919 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 

TCGATTCGTG GAGCAGGAAA TCTTTTAGGA AAATCCCAGT CTGGTTTCAT TGATTCTGTT 60

GGTTTTGAAT TGTATTCGCA GTTATTAGAG GAAGCTATTG CTAAACGAAA CGGTAATGCT 120

AACGCTAACA CAAGAACCAA AGGGAATGCT GAGTTGATTT TGCAAATTGA TGCCTATCTT 180

CCTGATACTT ATATTTCTGA TCAACGACAT AAGATTGAAA TTTACAAGAA AATTCGTCAA 240

ATTGACAACC GTGTCAATTA TGAAGAGTTA CAAGAGGAGT TGATAGACCG TTTTGGAGAA 300

TACCCAGATG TAGTAGCCTA TCTTTTAGAG ATTGGTTTGG TCAAATCATA CTTGGACAAG 360

GTCTTTGTTC AACGTGTGGA AAGAAAAGAT AATAAAATTA CAATTCAATT TGAAAAAGTC 420

ACTCAACGAC TGTTTTTAGC TCAAGATTAT TTTAAAGCTT TATCCGTAAC GAACTTAAAA 480

GCAGGCATCG CTGAGAATAA GGGATTAATG GAGCTTGTAT TTGATGTCCA AAATAAGAAA 540

GATTATGAAA TTTTAGAAGG TCTGCTGATT TTTGGAGAAA GTTTATTAGA GATAAAAGAG 600

TCTAAGGAAA AAAATTCCAT TTGATATTTT TCTTCTATAA AATAGATAAA ATGGTACAAT 660

AATAAATTGA GGTAATAAGG ATGAGATTAG ATAAATATTT AAAAGTATCG CGAATTATCA 720

AGCGTCGTAC AGTCGCAAAG GAAGTAGCAG ATAAAGGTAG AATCAAGGTT AATGGAATCT 780

TGGCCAAAAG TTCAACGGAC TTGAAAGTTA ATGACCAAGT GAAATCGCTT GGCAATAAGT 840

TGCTGCTTGT AAAGGTACTA GAGATGAAAG ATAGTACAAA AAAAGAAGAT GCAGCAGGAA 900

TGTATGAAAT TATCAGTGAA ACACGGGTAG AAGAAAATGT CTAAAAATAT TGTACAATTG 960

AATAATTCTT TTATTCAAAA TGAATACCAA CGTCGTCGCT ACCTGATGAA AGAACGACAA 1020

AAACGGAATC GTTTTATGGG AGGGGTATTG ATTTTGATTA TGCTATTATT TATCTTGCCA 1080

ACTTTTAATT TAGCGCAGAG TTATCAGCAA TTACTCCAAA GACGTCAGCA ATTAGCAGAC 1140

TTGCAAACTC AGTATCAAAC TTTGAGTGAT GAAAAGGATA AGGAGACAGC ATTTGCTACC 1200

AAGTTGAAAG ATGAAGATTA TGCTGCTAAA TATACACGAG CGAAGTACTA TTATTCTAAG 1260

TCGAGGGAAA AAGTTTATAC GATTCCTGAC TTGCTTCAAA GGTGATAAAA TGGAAAATTT 1320

ATTAGACGTA ATAGAGCAAT TTTTGAGTTT GTCAGATGAA AAGCTGGAAG AATTGGCTGA 1380

TAAAAATCAA TTATTGCGTT TACAAGAAGA AAAGGAAAGG AAGAATGCGT AAATTCTTAA 1440

TTATTTTGTT GCTACCAAGT TTTTTGACCA TTTCAAAAGT CGTTAGCACA GAAAAAGAAG 1500

TCGTCTATAC TTCGAAAGAA ATTTATTACC TTTCACAATC TGACTTTGGT ATTTATTTTA 1560

GAGAAAAATT AAGTTCTCCC ATGGTTTATG GAGAGGTTCC TGTTTATGCG AATGAAGATT 1620

TAGTAGTGGA ATCTGGGAAA TTGACTCCCA AAACAAGTTT TCAAATAACC GAGTGGCGCT 1680

TAAATAAACA AGGAATTCCA GTATTTAAGC TATCAAATCA TCAATTTATA GCTGCGGACA 1740

AACGATTTTT ATATGATCAA TCAGAGGTAA CTCCAACAAT AAAAAAAGTA TGGTTAGAAT 1800

CTGACTTTAA ACTGTACAAT AGTCCTTATG ATTTAAAAGA AGTGAAATCA TCCTTATCAG 1860

CTTATTCGCA AGTATCAATC GACAAGACCA TGTTTGTAGA AGGAAGAGAA TTTCTACATA 1920

TTGATCAGGC TGGATGGGTA GCTAAAGAAT CAACTTCTGA AGAAGATAAT CGGATGAGTA 1980

AAGTTCAAGA AATGTTATCT GAAAAATATC AGAAAGATTC TTTCTCTATT TATGTTAAGC 2040

AACTGACTAC TGGAAAAGAA GCTGGTATCA ATCAAGATGA AAAGATGTAT GCAGCCAGCG 2100

TTTTGAAACT CTCTTATCTC TATTATACGC AAGAAAAAAA TAAATGAGGG TCTTTATCAG 2160

TTAGATACGA CTGTAAAATA CGTATCTGCA GTCAATGATT TTCCAGGTTC TTATAAACCA 2220

GAGGGAAGTG GTAGTCTTCC TAAAAAAGAA GATAATAAAG AATATTCTTT AAAGGATTTA 2280

ATTACGAAAG TATCAAAAGA ATCTGATAAT GTAGCTCATA ATCTATTGGG ATATTACATT 2340

TCAAACCAAT CTGATGCCAC ATTCAAATCC AAGATGTCTG CCATTATGGG AGATGATTGG 2400

GATCCAAAAG AAAAATTGAT TTCTTCTAAG ATGGCCGGGA AGTTTATGGA AGCTATTTAT 2460

AATCAAAATG GATTTGTGCT AGAGTCTTTG ACTAAAACAG ATTTTGATAG TCAGCGAATT 2520

GCCAAAGGTG TTTCTGTTAA AGTAGCTCAT AAAATTGGAG ATGCGGATGG ATTTAAGCAT 2580

GATACGGGTG TTGTCTATGC AGATCCTCCA TTTATTCTTT CTATTTTCAC TAAGAATTCT 2640

GATTATGATA CGATTTCTAA GATAGCCAAG GATGTTTATG AGGTTCTAAA ATGAGGGAAC 2700

CAGATTTTTT AAATCATTTT CTCAAGAAGG GATATTTCAA AAAGCATGCT AAGGCGGTTC 2760

TAGCTCTTTC TGGTGGATTA GATTCCATGT TTCTATTTAA GGTATTGTCT ACTTATCAAA 2820

AAGAGTTAGA GATTGAATTG ATTCTAGCTC ATGTGAATCA TAAGCAGAGA ATTGAATCAG 2880

ATTGGGAAGA AAAGGAATTA AGGAAGTTGG CTGCTGAAGC AGAGCTTCCT ATTTATATCA 2940

GCAATTTTTC AGGAGAATTT TCAGAAGCGC GTGCACGAAA TTTTCGTTAT GATTTTTTTC 3000

AAGAGGTCCA TGAAAAAGAC AGGTGCGACA GCTTTAGTCA CTGCCCACCA TGCTGATGAT 3060

CAGGTGGAAA CGATTTTTAT GCGCTTGATT CGAGGAACCT CCTTGCGCTA TCTATCAGGA 3120

ATTAAGGAGA AGCAAGTAGT CGGAGAGATA GAAATCATTC GTCCCTTCTT GCATTTTCAG 3180

AAAAAAGACT TTCCATCAAT TTTTCACTTT GAAGATACAT CAAATCAGGA GAATCATTAT 3240

TTTCGAAATC GTATTCGAAA TTCTTACTTA CCAGAATTGG AAAAAGAAAA TCCTCGATTT 3300

AGGGATGCAA TCCTTAGGCA TTGGCAATGA AATTTTAGAT TATGATTTGG CAATAGCTGA 3360

ATTATCTAAC AATATTAATG TGGAAGATTT ACAGCAGTTA TTTTCTTACT CTGAGTCTAC 3420

ACAAAGAGTT TTACTTCAAA CTTATCTGAA TCGTTTTCCA GATTTGAATC TTACAAAAGC 3480

TCAGTTT3CT GAAGTTCAGC AGATTTTAAA ATTTAAAAGC CAGTATCGTC ATCCGATTAA 3540

AAATGGCTAT GAATTGATAA AAGAGTACCA ACAGTTTCAG ATTTGTAAAA TCAGTCCGCA 3600

GGCTGATGAA AAGGAAGATG AACTTGTGTT ACACTATCAA AATCAGGTAG CTTATCAAGG 3660

ATATTTATTT TCTTTTGGAC TTCCATTAGA AGGTGAATTA ATTCAACAAA TACCTGTTTC 3720

ACGTGAAACA TCCATACACA TTCGTCATCG AAAAACAGGA GATGTTTTGA TTAAAAATGG 3780

GCATAGAAAA AAACTCAGAC GTTTATTTAT TGATTTGAAA ATCCCTATGG AAAAGAGAAA 3840

CTCTGCTCTT ATTATTGAGC AATTTGGTGA AATTGTCTCA ATTTTGGGAA TTGCGACCAA 3900

TAATTTGAGT AAAAAAACGA AAAATGATAT AATGAACACT GTACTTTATA TAGAAAAAAT 3960

AGATAGGTAA AAAATGTTAG AAAACGATAT TAAAAAAGTC CTCGTTTCAC ACGATGAAAT 4020

TACAGAAGCA GCTAAAAAAC TAGGTGCTCA ATTAACTAAA GACTATGCAG GAAAAAATCC 4080

AATCTTAGTT GGGATTTTAA AAGGATCTAT TCCTTTTATG GCTGAATTGG TCAAACATAT 4140

TGATACACAT ATTGAAATGG ACTTCATGAT GGTTTCTAGC TACCATGGTG GAACAGCAAG 4200

TAGTGGTGTT ATCAATATTA AACAAGATGT GACTCAAGAT ATCAAAGGAA GACATGTTCT 4260

ATTTGTAGAA GATATCATTG ATAGAGGTCA AACTTTGAAG AATTTGCGAG ATATGTTTAA 4320

AGAAAGAGAA GCAGCTTCTG TTAAAATTGC AACCTTGTTG GATAAACCAG AAGGACGTGT 4380

TGTAGAAATT GAGGCAGACT ATACCTGCTT TACTATCCCA AATGAGTTTG TAGTAGGTTA 4440

TGGTTTAGAC TACAAAGAAA ATTATCGTAA TCTTCCTTAT ATTGGAGTAT TGAAAGAGGA 4500

AGTGTATTCA AATTAGAAAG AATAATCTTT AATGAAAAAA CAAAATAATG GTTTAATTAA 4560

AAATCCTTTT CTATGGTTAT TATTTATCTT TTTCCTTGTG ACAGGATTCC AGTATTTCCT 4620

ATTCTGGGAA TAACTCAGGA GGAAGTCAGC AAATCAACTA TACTGAGTTG GTACAAGAAA 4680

TTACCGATGG TAATGTAAAA GAATTAACTT ACCAACCAAA TGGTAGTGTT TCGAAGTTTC 4740

TGGTGTCTAT AAAAATCCTA AAACAAGTAA AGAAGGAACA GGTATTCAGT TTTTCACGCC 4800

ATCTGTTACT AAGGTAGAGA AATTTACCAG CACTATTCTT CCTGCAGATA CTACCGTATC 4860

AGAATTGCAA AAACTTGCTA CTGACCATAA AGCAGAAGTA ACTGTTAAGC ATGAAAGTTC 4920

AAGTGGTATA TGGATTAATC TACTCGTATC CATTGTGCCA TTTGGAATTC TATTCTTCTT 4980

CCTATTCTCT ATGATGGGAA ATATGGGAGG AGGCAATGGC CGTAATCCAA TGAGTTTTGG 5040

ACGTAGTAAG GCTAAAGCAG CAAATAAAGA AGATATTAAA GTAAGATTTT CAGATGTTGC 5100

TGGAGCTGAG GAAGAAAAAC AAGAACTAGT TGAAGTTGTT GAGTTCTTAA AAGATCCAAA 5160

ACGATTCACA AAACTTGGAG CCCGTATTCC AGCAGGTGTT CTTTTGGAGG GACCTCCGGG 5220

GACAGGTAAG ACTTTGCTTG CTAAGGCAGT CGCTGGAGAA GCAGGTGTTC CATTCTTTAG 5280

TATCTCAGGT TCTGACTTTG TAGAAATGTT TGTCGGAGTT GGAGCTAGTC GTGTTCGCTC 5340

TCTTTTTGAG GATGCCAAAA AAGCAGCACC AGCTATCATC TTTATCGACT GAAATGGATG 5400

CCCGTGGGAC GTCAACGTGG AGTCGGTCTC GGCGGAGGTA ATGACGAACG TGAACAAACC 5460

TTGAACCAAC TTTTGATTGA GATGGATGGT TTTGAGGGAA ATGAAGGGAT TATCGTCATC 5520

GCTGCGACAA ACCGTTCAGA TGTACTTGAT CCTGCCCTTT TGCGTCCAGG ACGTTTTGAT 5580

AGAAAAGTAT TGGTTGGCCG TCCTGATGTT AAAGGTCGTG AAGCAATCTT GAAAGTTCAC 5640

GCTAAGAACA AGCCTTTAGC AGAAGATGTT GATTTGAAAT TAGTGGCTCA ACAAACTCCA 5700

GGCTTTGTTG GTGCTGATTT AGAGAATGTC TTGAATGAAG CAGCTTTAGT TGCTGCTCGT 5760

CGCAATAAAT CGATAATTGA TGCTTCAGAT ATGATGAAAG CAGAAGATAG AGTTATTGCT 5820

GGACCTTCTA AGAAAGATAA GACAGTTTCA CAAAAAGAAC GAGAATTGGT TGCTTACCAT 5880

GAGGCAGGAC ATACCATTGT TGGTCTAGTC TTGTCGACT 5919
(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1863 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

GAGCTCGGTA CCCGGGGATC ATACTCAAGA GGAGGTAATC CAATGAACAC TAGTCTTAAA 60

CTCAGCAAAC AACTCAGTTT TGGAGAGGAG ATTGCTAATA GCGTGACCCA TGCTGTGGGT 120

GCAGTCATCA TGCTTATCTT GCTGCCTATT TCATCCATCT ATAGTTATGA AGCACACGGA 180

TTTTTATCCT CTATCGGCGT TTCCATTTTC GTCATCAGTC TCTTTCTCAT GTTCCTATCA 240

TCCACCATTT ATCACTCTAT GGCCTATGGT TCGACCCACA AATATGTTTT GCGAATCATT 300

GACCATTCTA TGATTTACGT TGCCATTGCC GGCTCATACA CGCCCGTTGT CTTGACCTTG 360

ATGAATAACT GGTTTGGCTA TCTGATTATT GTCATCCAAT GGGGAACGAC CATCTTTGGT 420

ATTCTCTATA AAATCTTTGC TAAAAAGGTC AATGAGAAAT TTAGCCTTGC TCTTTACCTG 480

ATTATGGGCT GGTTGGTTCT GGCTATCATT CCTGCCATTA TCAGTCAAAC GACACCCGTT 540

TTCTGGAGTC TCATGGTAAC TGGCGGACTC TGTTATACAG TTGGAGCTGG ATTTTATGCC 600

AAGAAAAAAC CTTATTTCCA CATGATTTGG CATCTCTTTA TCCTAGCTGC GTCCGCACTC 660

CAATACATCG CTATTGTTTA TTACATGTAA AAAAGTTGAG AAATTCAATC TCAACTTTTT 720

TCTTTACACA TATTGATAAA GTACTGGTGC AAGCGCACAT CATCAGTCAA TTCTGGATGA 780

AAAGAACTTA CCAACATATT TTTTTCTTGG GCTGCAACAA TTTGATTGTT CACTATTGCT 840

AAAATTTCTA CACCCTCACC AACACTACTG ATAATCGGAC CACGGATAAA GGTCATTGGA 900

ATCTTGCCAA CTCCCTTACA TTCTGCTTCC GTGTAGAAAC TTCCTAATTG GCGCCCATAA 960

GCATTACGCT CGACCACCAT ATCCATAGTT CCTAGATGAC TCTCTTTCTG AGAAGTGATT 1020

TCCTTAGCCA GCAAAATTAA GCCCGCACAG GTCCCAAACA CTGGTAAGCC AGATAGAATG 1080

GCTTCTCGTA TGGGAAGTAG CATGTTCTGG TCACGTAAGA GCTTGCCCAT GGTTGTAGAC 1140

TCACCACCAG GCAAAATAAA CCCGACAAGT CACTCTGATC TTGCTGAAAA CATCTAGATT 1200

TCTGAGTTCT ACACTCTCGA CACCTAATTG ATCTAGCACT TTTGCATGTT CTGCAAAGGC 1260

CCCTTGCAAG GCCAATATTC CGATTTTCAT CTATTTTCCT CGTTCAGCCA TGAGAATTTG 1320

GATTCATTTT CATTAATACC AACCATGGCT TCTCCTAAAT CTTCAGAGAT TTGAGCTAGG 1380

ATTTGAGGAT TACGGAAGTT AGTCACAGCC TTAACAATGG CACTCGCTCG TTTAACAGGA 1440

TCTCCTGACT TGAAAATACC TGAACCGACA AAGACCCCCT CTGCCCCTAA TTGCATCATT 1500

AACGCAGCAT CTGCTGGCGT TGCAACACCT CCAGCAGCGA AATTTACAAC TGGCAATTTT 1560

CCATGTTCAT GAACATATTG GACCAATTCT ACAGGGACTT GCAAATCCTT GGCAGCAACA 1620

TAAAGCTCGT CCTCACGTAA GTTTTGAATG CGGCGAATTT CCTGATTCAT CATACGCATA 1680

TGACGAACAG CTTGGACTAT ATCCCCTGTC CCTGGTTCTC CTTTAGTACG AATCATGGAA 1740

GCACCTTCAG CGATACGACG CAAGGCTTCA CCCAAATCCT TAGCACCACA GACAAAAGGA 1800

ACTTGGAATT CTTTCTTGTC CACATGGAAA CGGTCATCAG CTGGAGATAG AACTTCACTC 1860

TCG 1863

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5448 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

TAAAGAAGGT GGATTTGAAG TTAACGGTAA ATTCATCAAA GTTTCTGCTG AACGTGATCC 60

AGAACAAATC GACTGGGCTA CTGACGGTGT AGAAATCGTT CTTGAAGCTA CTGGTTTCTT 120

TGCTAAGAAA GAAGCAGCTG AAAAACACCT TAAAGGTGGA GCTAAAAAAG TTGTTATCAC 180

TGCTCCTGGT GGAAACGACG TTAAAACAGT TGTATTCAAC ACTAACCACG ACGTTCTTGA 240

CGGTACTGAA ACAGTTATCT CAGGTGCTTC ATGTACTACA AACTGCTTGG CTCCAATGGC 300

TAAAGCTCTT CAAGACAACT TTGGTGTTGT TGAAGGATTG ATGACTACTA TCCACGCTTA 360

CACTGGTGAC CAAATGATCC TTGACGGACC ACACCGTGTG GTGACCTTCG CCGTGCTCGC 420

GCTGGTGCTG CAAACATCGT TCCTAACTCA ACTGGTGCTG CAAAAGCTAT CGGTCTTGTA 480

ATCCCAGAAT TGAATGGTAA ACTTGATGGA TCTGCACAAC GCGTTCCAAC TCCAACTGGA 540

TCAGTTACTG AATTGGTCGC AGTTCTTGAA AAGAACGTTA CTGTTGATGA AGTGAACGCA 600

GCTATGAAAG CAGCTTCAAA CGAATCATAC GGTTACACAG AAGATCCAAT CGTATCTTCA 660

GATATCGTAG GTATGTCTTA CGGTTCATTG TTTGACGCAA CTCAAACTAA AGTTCTTGAC 720

GTTGACGGTA AACAATTGGT TAAAGTTGTA TCATGGTACG ACAACGAAAT GTCATACACT 780

GCACAACTTG TTCGTACTCT TGAATACTTC GCAAAGATTG CTAAATAATT CTTGAGTTGA 840

TAGAAAGCAA GGCTTTGTGG TCTTGCTTTT TTATATGGAA AAATGGATGA CACGATCATC 900

CATTCTTTTT TAATTCTTTT TCAAATGTAT TTGAAAGGGT AGTGAAAGTT AGCCTCTCTA 960

AAGTAAGTGG GTGGGTAAAG GAAAGTCGGA AGGCATGAAG CATAAGCCGG CTTGTCTTTG 1020

ATTTACTATT ATAGAGAGGG TCTCCCAGGA TAGGAAGATT ATGATGCAAA AGGTGCACAC 1080

GAATCTGATG GGTTCGCCCT GTCTTTAGCT TGCAATGAGC CAAGGAAGTC TTGTTTGAGA 1140

ATTGCTTTAA TCTGCTTACA TGCGTTTCAG CATATTTCCC ATTTTTTGCA TCAACTATTC 1200

TTTTTCTACG ATCATGGCGA TCACGTCCAA TTTTGTCTCT GAAAACAAGT TCTTTTCTGT 1260

TGATATTTCC ATCAACTAGA GCCCAATATT CTCTAGAAAT CTCTTTTTTC TCCAATAAGC 1320

GATTGAGAAT GGGCAGGATA AAAGGATTTT TGGCAAAGAG AACTAAGCCA CTGGTTTCCA 1380

TGTCCAGACG ATGAACGACA TAGCAGGTTT GGCCAACATA GGTACTGACA TGGTTAAGAA 1440

GGGCAATTTC GTTTGGTTGA TTACCATGCG TTTTCATCCC CTCTGGTTTG TTTACAATAA 1500

TCAAGTGTTG ATCTTGATAA ACTTCCTGCA CTAAGTCTGG GTTGCCCCAA GGGATCGTCT 1560

TTTGGGGATA ATCTTCCTCG TCAAAAGTCA ACTGGCAAAC ATCTCCAGGA TTTACGATTT 1620

CGTTCCAGCG GACTTCTTCT TGATTTATCA AAATATGTTT CTTGATTCTC AAAAAATGAC 1680

GGATTTTTCT AGGGATGAGG AGTTGTTCCT CAAGTAATTG CTTTACCGTC ATTTGAGGTA 1740

GAGAGGCGGG TAATGTAAAT GTGAATTGCA TACAGATATT GTAACAAAAA AAGCCCTATT 1800

TGGATAGGAA ATAGCTAAAT TCTTGTCTTC CTATGATGAA GATGATAAAA TAAACGCATG 1860

AAATTAGATA AATTATTTGA GAAATTTCTT TCTCTTTTTA AAAAAGAAAC AAGTGAACTA 1920

GAGGACTCTG ATTCTACTAT CTTACGTCGC TCTCGTAGTG ATCGAAAAAA ATTAGCCCAA 1980

GTAGGTCCGA TTCGAAAATT CTGGCGTCGT TATCATCTAA CAAAGATTAT CCTTATACTA 2040

GGTTTGAGTG CAGGCTTGCT AGTTGGAATC TATTTGTTTG CTGTAGCCAA GTCGACCAAT 2100

GTCAATGATT TGCAAAATGC CTTGAAAACT CGGACTCTTA TTTTTGACCG TGAAGAAAAA 2160

GAGGCTGGTG CCTTGTCTGG TCAAAAGGGA ACCTATGTTG AGCTGACTGA CATCAGTAAA 2220

AACTTGCAGA ATGCTGTTAT TGCGACAGAA GACCGTTCTT TCTATAAAAA TGACGGGATT 2280

AACTATGGCC GTTTCTTCTT GGCTATTGTC ACTGCTGGAC GTTCAGGTGG TGGCTCTACC 2340

ATTACCCAAC AGCTGGCTAA AAACGCCTAT TTATCGCAGG ATCAAACTGT TGAGAGAAAA 2400

GCGAAAGAAT TTTTCCTTGC CTTAGAATTA AGCAAAAAAT ATAGTAAGGA GCAAATTCTA 2460

ACCATGTACC TTAACAACGC TTATTTTGGA AATGGTGTGT GGGGTGTAGA AGATGCGAGT 2520

AAGAAATACT TTGGAGTTTC TGCATCAGAA GTGAGTCTGG ATCAAGCTGC GACTCTGGCA 2580

GGGATGCTCA AGGGGCCGGA ACTGTATAAT CCCTTGAATT CCGTAGAAGA TTCTACTAAT 2640

CGGCGCGATA CTGTCTTGCA GAATATGGTT GCAGCAGGAT ATATTGATAA AAACCAAGAA 2700

ACCGAAGCTG CTGAAGTTGA TATGACTTCG CAATTGCACG ATAAGTATGA AGGAAAAATC 2760

TCAGATTACC GTTACCCCTC TTATTTTGAT GCGGTGGTTA ATGAAGCTGT TTCCAAGTAT 2820

AATCTAACAG AGGAAGAGAT TGTCAATAAT GGCTACCGCA TTTACACAGA GCTGGACCAA 2880

AACTACCAAG CAAATATGCA GATTGTTTAT GAAAACACAT CGCTATTTCC GAGGGCAGAG 2940

GATGGAACGT TTGCTCAATC AGGAAGTGTA GCTCTCGAAC CGAAAACAGG GGGAGTTCGT 3000

GGAGTTGTCG GTCAAGTTGC TGACAATGAT AAAACTGGAT TCCGGAATTT CAACTATGCA 3060

ACCCAATCAA AGCGTAGTCC TGGTTCTACA ATTAAGCCTT TAGTTGTTTA TACACCAGCA 3120

GTTGAAGCAG GCTGGGCTTT GAATAAGCAG TTGGATAACC ATACCATGCA GTATGATAGC 3180

TATAAGGTTG ATAACTATGC AGGGATCAAA ACAAGTCGAG AAGTTCCTAT GTATCAATCC 3240

TTGGCAGAAT CGCTTAATCT ACCTGCTGTT GCCACTGTTA ATGATTTGGG TGTTGACAAG 3300

GCTTTTGAGG CAGGCGAAAA ATTCGGACTC AACATGGAAA AGGTCGACCG TGTTCTTGGT 3360

GTCGCCTTGG GAAGCGGTGT TGAAACCAAC CCTCTTCAAA TGGCTCAAGC ATACGCTGCC 3420

TTTGCAAATG AAGGTTTAAT GCCTGAAGCT CATTTTATTA GTAGAATTGA AAATGCTAGT 3480

GGACAAGTTA TTGCGAGTCA TAAAAATTCA CAAAAACGGG TGATTGATAA GTCTGTAGCT 3540

GACAAGATGA CCAGTATGAT GTTGGGGACT TTCACCAACG GTACCGGTAT TAGTTCATCG 3600

CCTGCAGACT ATGTCATGGC AGGGAAAACT GGAACAACTG AAGCAGTTTT CAATCCGGAG 3660

TACACAAGTG ACCAGTGGGT AATTGGTTAT ACTCCGGATG TAGTGATTAG CCACTGGCTT 3720

GGCTTTCCGA CCACTGATGA AAATCACTAT CTAGCTGGCT CTACTTCAAA CGGTGCAGCT 3780

CATGTCTTTA GAAACATTGC CAATACTATT TTACCTTATA CGCCAGGAAG TACCTTTACG 3840

GTTGAAAATG CTTATAAGCA AAATGGAATT GCACCAGCCA ATACAAAAAG ACAAGTACAA 3900

ACCAATGATA ATAGCCAGAC AGATGATAAT TTGTCTGATA TTCGAGGGCG TGCGCAAAGT 3960

CTAGTAGATG AGGCTAGCCG GGCTATCTCA GATGCGAAGA TTAAGGAAAA GGCTCAAACA 4020

ATATGGGATT CGATAGTCAA TCTATTTCGC TAAGATGCTT GTCAAAGCCT AGCTTTCTTG 4080

TTATAATGGA TAAGATGGAG GCGTTATGGC ACTAAAAAAA GCAAGCCTAG CTTGTGCGGT 4140

TTGTGGTTCG AGAAACTATT CAATCAAGAT CAGCGGAAAC CCCAAGCCTA CACGACTAGA 4200

AGTAAATAAA TTTTGTAAGC ATTGTGGCAA GTACACTACA CACAGAGAAA CGAGATAGGA 4260

GAGAGCGATG CGTTTTATTG GAGATATTTT TAGACTTCTT AAAGACACAA CATGGCCAAC 4320

TCGCAAGGAA AGCTGGAGAG ATTTTCGTTC TATCATGGAA TACACAGCTT TCTTTGTAGT 4380

AATTATTTAC ATTTTTGACC AGTTGATTGT TTCAGGTTTG ATTCGATTTA TTAACATTTT 4440

TTAGAAGATT AGTGGAGTTA ATTACACTAG AAATCTTCTA TTTATGAAAG GAAATATCAT 4500

GGATAGTTTT GATAAAGGAT GGTTTGTTTT ACAAACTTAT TCTGGTTATG AAAATAAGGT 4560

GAAAGAAAAT CTATTACAAC GTGCACAAAC CTACAATATG TTGGATAATA TTCTACGCGT 4620

TGAAATTCCA ACACAAACAG TGCAAGTTGA AAAAAATGGA AAGAGAAAAG AAGTAGAAGA 4680

AAATCGCTTT CCAGGTTATG TTCTTGTAGA AATGGTCATG ACAGATGAAG CTTGGTTTGT 4740

TGTTCGAAAC GCACAGAGTC CTACAAAATT CATTTCAGAA CAAACAGCTT ATGAAATTGA 4800

TGAAGAGGTT CGTTCATTAT TAAATGAGGC ACGAAATAAA GCTGCTGAAA TTATTCAGTC 4860

AAATCGTGAA ACTCACAAGT TAATTGCAGA AGCATTATTG AAATACGAAA CATTGGATAG 4920

TACACAAATT AAAGCTCTTT ACGAAACAGG AAAGATGCCT GAAAGCAGTA GAAGAGGAAT 4980

CTCATGCACT ATCCTATGAT GAAGTAAAGT CAAAAATGAA TGACGAAAAA TAACCCTGAG 5040

AGAGGCTGGA GCCTCTCTTT TTTGTGCAGT TTAGGAGCTA AAGGGAACAG AATGGAGAAA 5100

ATGGAACAAA TGTGTTTTCT AATCTGTTAG ACTGTATCTA GAAAGGGGAA AATTATGATT 5160

AAAGAATTGT ATGAAGAAGT CCAAGGGACT GTGTATAAGT GTAGAAATGA ATATTACCTT 5220

CATTTATGGG AATTGTCGGA TTGGGACCAA GAAGGCATGC TCTGCTTACA TGAATTGATT 5280

AGTAGAGAAG AAGGACTGGT AGACGATATT CCACGTTTAA GGAAATATTT CAAAACCAAG 5340

TTTCGAAATC GAATTTTAGA CTATATCCGT AAGCAGGAAA GTCAGAAGCG TAGATACGAT 5400

/•AAGAACCCT ATGAAGAAGT GGGTGAGATC CCCGGTACCG AGCTCGAA 5448
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1040 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

TAGTGAATTC GAGCTCGGTA CCCGGGGATC GTTTCTCGGT TCTTTTGGAG CACAAGGGCA 60 

TCCATCCCAT TGTCTATATT TCCAAAATGG ATTTGTTGGA AGATAGGGGA GAACTGGATT 120 

TTTACCAGCA GACCTATGGT GACATCGGCT ATGACTTTGT GACCAGTAAA GAGGAACTCC 180

TGTCTTTGTT AACAGGCAAG GTTACGGTCT TTATGGGGCA GACAGGTGTT GGGAAGTCAA 240

CTCTTCTCAA TAAAATCGCA CCAGACCTCA ATCTTGAAAC GGGAGAAATT TCAGACAGTC 300

TAGGTCGCGG TCGCCATACC ACTCGAGCTG TTAGTTTTTA CAATCTCAAC GGGGGTAAAA 360

TCGCAGATAC ACCAGGATTT TCATCCTTGG ACTATGAAGT ATCAAGGGCT GAAGACCTCA 420

ATCAGGCTTT CCCAGAGATT GCTACTGTTA GCCGAGATTG TAAGTTCCGT ACTTGTACCC 480

ATACCCATGA GCCGTCTTGT GCCGTCAAAC CAGCTGTTGA AGAGGGTGTT ATTGCAACCT 540

TCCGTTTTGA CAATTACCTG CAATTCCTTA GTGAAATTGA AAATCGTAGA GAAACCTATA 600

AAAAAGTCAG CAAAAAAATT CCAAAATAAG GAGAAACCTA TGTCTCAATA CAAGATTGCT 660 

CCGTCAATTC TGGCAGCAGA TTATGCCAAC TTTGAACGTG AAATCAAACG TCTAGAAGCA 720

ACTGGGGCAG AATATGCCCA TATCGATTCT GGACAGTCAT TTTGTACCGC AAATCAGTTT 780

TGGTGCAGGT GTGGTCGAGA GCTTCGTCCT CATAGTAAGA TGGTTTTCGA TTGCCACTTG 840

ATGGTGTCAA ACCCTGAGCA TCATCTGGAA GATTTTGCGC GTGCAGGTGC AGACATCATC 900 

AGTATCCATG TAGAAGCAAC ACCTCATATT CATGGCGCCC TCCAAAAAAT TCGTTCACTC 960 

GGAGTTAAGC CTTCAGTCGT TATCAATCCT GGCACACCAG TTGAAGCCAT CAAGCACGTC 1020

CTTCATCTAG TGACAAGTTT 1040

(2) INFORMATION FOR SEQ ID NO:31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

ATATCACGAC GGAGCCATAC TACCGATTTT CTTAAGCATA GCGCCACCTT TACCGATGAT 60 

AATCCCTTTT TGGCTATCGC GCTCGACCAT GATGGTTGCA CGGATGTGAA CCTTGTCTGT 120

CTCTTCGTCT CGTTTCATAG AGTCAACAAC TACTGCTACA GAATGCGGAA TCTCTTCACG 180

AGTTAGGTGC AAGACTTTCT CGCGAACCAT TTCTGAAACT AAGAAACGTT CTGGATGATC 240

TGTGATTTGA TCAGACGGGA AATATTGGAA ACCTTCATCC AGATTTTCAC TCAAAATATC 300

CACTAGACGA GACACGTTAT TTCCCTGAAG GGCTGAGATT GGAACAATTT CCTTAAAGTC 360

CATTTGATTA CGGAAGTCAT CAATCTGAGA CAAGAGCTGG TCTGGATGGA CCTTATCGAT 420

TTTATTCACC ACCAAAATCA CAGGAACCTT GGCAGCCTTG GAGACGCTCG ATAATCATAT 480

CGTCCCCCTT ACCACGCGCT TCATCAGCAG GCACCATGAA AAGAACAGTG TCCACTTCGC 540

GAAGGTACTG TAGGCAGACT CAACCATGAA ATCTCCGAGA GCTGTTTTAA GTTTGTGAAT 600

CCCTGGTGTG TCGATAAAGA CAATTTGCTC CTTATCAGTC GTGTAATTCC CATGATTTTA 660

TTGCGCGTTG TCTGCGCCTT GTCACTCATG ATGGCAATCT TTTGCCCCAT AACGTGATTT 720

AAAAAGGTTG ACTTCCCAAC ATTGGGACTC CTAAAATGGC TACAAACCTG ATTTAAAATT 780

CATAATTCC 789
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1233 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 
TGAGATAAAC TTGCGACTCA TATGAGAATA TGAATCAAGC CGTCCTCGTG AACCCCGATA 60

TCAACGAAGG CACCGAAGTC AACAACGTTA CGCACAACAC CTTCTAGCTT CTGACCTACC 120

ACTAGGTCTT TGATATCTAG GACATCTTGG CGAAGCACAG GTGCGTCAAA GGAATCACGG 180

AAATCTCGAC CTGGTTTGAG AAGATCTGCA ATGATATCTT TAAGGGTTTC TGGACCGAGG 240

TCTAGCTCTT GAGCCATTTC CTTGACTGAA AGGGACTTGA GTTTGCTTGG GCTTCTTCGT 300

TTAGGTCTTT AATATCTAAA CGTTTGAAGA GTCCTTAACT GCAGTGTAAT TCTCTGGGTG 360

AACTCCTGTA TTATCAAGGA TATTGCTACT TTCAGGGATA CGAAGGAAAC CAGCAGCCTG 420

CTCAAAGGCC TTGGCTCCCA GACGAGGAAC TTTCTTGATT TGGGCGCGTG AAGTGATTTT 480

TCCTTCTTCC TCGCGGTATT TGACAATATT TTCAGAGATA GTTTTGTTGA GTCCAGCTAC 540

GTGTGAAAGA AGAGCTGGGC TAGCTGTATT GACATTGACA CCAACTTGGT TAACCACTGT 600

ATCGACAACA AAGTCCAGAC TCTCAGATAG TTTCTTCTGA CTGACATCGT GTTGGTATTG 660

ACCGACACCA ATTGACTTAG GATCGATTTT GACCAATTCC GCAAGAGGAT CTTGCAAACG 720

ACGGGCGATA GAAATGGCAG AGCGTTTTTC AACGGTCAAG TCTGGAAACT CCTGACGAGC 780

AAGTTCGCTG GCAGAATAGA CAGAAGCACC ACTTTCATTA ACGATAACAT AGCTGACTTC 840

AGGGAAATCT TTCAGAACTT CCGCTACAAA AGCTTCACTT TCACGACTGG CCGTTCCATT 900

TCCAATGGCA ATAATCTCTA CACCGTATTG ACCAATTAAA TCTGCTAAAT CTTTCTTGGC 960

TTCTTCGATT TGACGAGCTG ATGCTGGTTT AACAGGATAA ATAACCTGAG TTGTCAGCAT 1020

TTTTCCTGTT GCATCCACGA CAGCTAACTT GGCACCTGTA CGAAAGGCTG GGTCAAATCC 1080

AAGAACCACG CGCCCTTTCA GTGGAGCAAC CAAGAGGAGA TTGCGCAGAT TGTCAGAAAA 1140

AAGTTGGATA GCTCCCTCTT CAGCTTTCTC AGTTAATTCT GTCCGAATAC GACGCTCAAT 1200

AGCAGGCAAG ACCTTTTTCT TAACGGATTG CTG 1233
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6679 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
ACAAGGCGTC ATCCGTGTAT TTCTTAAATA CGGACCAAAT GGTGAGAAAG TTATCACTAA 60

CTTGAAACGT GTTTCTAAAC CAGGACTTCG TGTCTACAAA AAACGTGAAG ACCTTCCAAA 120

AGTTCTTAAC GGACTTGGAA TTGCCATCCT TTCAACTTCT GAAGGTTTGC TTACTGATAA 180

AGAAGCACGC CAAAAGAATG TTGGTGGTGA GGTTATCGCT TACGTTTGGT AAAVTCAAGA 240

TACAAAGCTC GTAAAGAACA AAGCAAAATT AGGAAGTTGG AGAAGTTTGT TTACAAACAA 300

GCCAACTTAT CTATTTTGCA CAGTTCTTAG ATCGTGTTCA GTTCAGCTCT TGAACTAAAT 360

AAGTATCTGA ACCCCGTGAA AACTGGCCGT TCTGGCTGAC AATTTAACAG GAGAAAATAA 420

ACATGTCACG TATTGGTAAT AAAGTTATCG TGTTGCCTGC TGGTGTTGAA CTCGCTAACA 480

ATGACAACGT TGTAACTGTA AAAGGACCTA AAGGAGAACT TACTCGTGAG TTCTCAAAAG 540

ATATTGAAAT CCGTGTGGAA GGTACTGAAG TAACTCTTCA CCGTCCAAAC GATTCAAAAG 600

AAATGAAAAC TATCCACGGA ACTACTCGTG CCCTTTTGAA CAACATGGTT GTTGGTGTAT 660

CAGAAGGATT CAAGAAAGAA CTTGAAATGC GTGGGGTTGG TTACCGTGCA CAGCTTCAAG 720

GATCTAAACT TGTTTTGGCT GTTGGTAAAT CTCATCCAGA CGAAGTTGAA GCTCCAGAAG 780

GAATTACTTT TGAACTTCCA AACCCAACAA CAATCGTTGT TAGCGGAATT TCAAAAGAAG 840

TAGTTGGTCA AACAGCTGCT TACGTACGTA GCCTTCGTTC ACCAGAACCA TATAAAGGTA 900

AAGGTATCCG TTACGTTGGT GAATTCGTTC GTCGTAAAGA AGGTAAAACA GGTAAATAAT 960

GTTGAGTGGT TGATCATCAA CCACCAACCT ATTTTCCAAC TTTGTGCATA GCAACGATTT 1020

AAAACTAAAG AGGTGAAAAC TGTGATTTCA AAACCAGATA AAAACAAACT CCGCCAAAAA 1080

CGCCACCGTC GCGTTCGGGA AAACTCTCTG GAACTGCTGA TCGCCCACGT TTGAACGTAT 1140

TCCGTTCTAA TACAGGCATC TACGCTCAAG TGATTGATGA CGTAGCGGGT GTAACGCTCG 1200

CAAGTGCTTC AACTCTTGAT AAAGAAGTTT CAAAAGGAAC TAAAACTGAA CAAGCCGTTG 1260

CTGTCGGTAA ACTCGTTGCA GAACGTGCAA ACGCTAAAGG TATTTCAGAA GTGGTGTTCG 1320

ACCGCGGTGG ATATCTATAT CACGGACGTG TGAAAGCTTT GGCTGATGCA GCTCGTGAAA 1380

ACGGATTGAA ATTCTAATAG GAGGACACTA GAAAATGGCA TTTAAAGACA ATGCAGTTGA 1440

ATTAGAAGAA CGCGTAGTTG CTGTCAACCG TGTTACAAAA GTTGTTAAAG GTGGACGTCG 1500

TCTTCGTTTC GCAGCTCTTG TTGTTGTTGG TGACCACAAT GGTCGCGTAG GATTTGGTAC 1560

TGGTAAAGCT CAAGAAGTTC CAGAAGCAAT CCGTAAAGCA GTAGATGATG CTAAGAAAAA 1620

CTTGATCGAA GTTCCTATGG TTGGAACAAC AATCCCACAC GAAGTTCTTT CAGAATTCGG 1680

TGGAGCTAAA GTATTGTTGA AACCTGCTGT AGAAGGTTCT GGAGTTGCCG CTGGTGGTGC 1740

AGTTCGTGCC GTTGTGGAAT TGGCAGGTGT GGCAGATATT ACATCTAAAT CACTTGGTTC 1800

TAACACTCCA ATCAACATTG TTCGTGCAAC TGTTGAAGGT TTGAAACAAT TGAAACGCGC 1860

TGAAGAAATT GCTGCCCTTC GTGGTATTTC AGTTTCTGAT TTGGCATAAG AAAGGGGATA 1920

AAATGGCTCA AATTAAAATT ACTTTGACTA AGTCTCCAAT CGGACGCATT CCATCACAAC 1980

GTAAAACTGT TGTAGCACTT GGACTTGGCA AATTGAACAG CTCTGTTATT AAAGAAGATA 2040

ACGCTGCTAT CCGTGGTATG ATTACAGCAG TATCTCACTT AGTAACAGTT GAAGAAGTAA 2100

ACTAATGAAG TTTTAGGGGA TGTGCACTGT ACCATCCCCT AAAACTAGAT ATAGTCATCT 2160

ATGATGACAT CGTATAGGCG AGTTGATGGG GGAGACAACC TTTTCTCCCT TATCGGCGCT 2220

AGCATTTTAC AAAAGAGGAG AAAATAAAAA TGAAACTTCA TGAATTGAAA CCTGCAGAAG 2280

GTTCTCGTAA AGTACGTAAC CGCGTTGGTC GTGGTACTTC ATCAGGTAAC GGTAAAACAT 2340

CTGGTCGTGG TCAAAAAGGT CAAAAAGCTC GTAGCGGTGG CGGAGTTCGC CTTGGTTTTG 2400

AAGGTGGACA AACTCCATTG TTCCGTCGTC TTCCAAAACG TGGATTCACT AACATCAACG 2460

CTAAAGAATA CGCAATTGTG AACCTTGACC AATTGAACGT CTTTGAAGAT GGTGCTGAAG 2520

TAACTCCAGT TGTTCTTATC GAAGCAGGAA TTGTTAAAGC TGAAAAGTCA GGTATTAAAA 2580

TTCTTGGTAA CGGTGAGTTG ACTAAGAAAT TGACTGTGAA AGCAGCTAAA TTCTCTAAAT 2640

CAGCTGAAGA AGCTATCACT GCTAAAGGTG GTTCAGTAGA AGTCATCTAA GAGAGGTGAC 2700

CTATGTTTTT TAAATTATTA AGAGAAGCTC TTAAAGTCAA GCAGGTTCGA TCAAAAATTT 2760

TATTTACAAT TTTTATCGTT TTGGTCTTTC GTATCGGAAC TAGCATTACA GTTCCTGGTG 2820

TGAATGCCAA TAGCTTGAAT GCTTTAAGTG GATTATCCTT CTTAAACATG TTGAGCTTGG 2880

TGTCGGGGAA TGCCCTAAAA AACTTTTCGA TTTTTGCCCT AGGAGTTAGT CCCTATATCA 2940

CCGCTTCTAT TGTTGTCCAA CTCTTGCAAA TGGATATTTT ACCCAAGTTT GTAGAGTGGG 3000

GTAAACAAGG GGAAGTAGGT CGAAGAAAAT TGAATCAAGC TACTCGTTAT ATTGCTCTAG 3060

TTCTCGCTTT TGTGCAATCT ATCGGGATTA CAGCTGGTTT TAATACCTTG GCTGGAGCTC 3120

AATTGATTAA AACTGCTTTA ACTCCACAAG TTTTTCTGAC GATTGGTATC ATCTTAACAG 3180

CTGGTAGTAT GATTGTCACT TGGTTGGGTG AGCAAATTAC AGATAAGGGA TACGGAAACG 3240

GTGTTTCCAT GATTATCTTT GCCGGGATTG TTTCCTCAAT TCCAGAGATG ATTCAGGGCA 3300

TCTATGTGGA CTACTTTGTG AACGTCCCAA GTAGCCGTAT CACTTCATCT ATCATTTTCG 3360

TAATCATTTT GATTATTACT GTATTGTTGA TTATTTACTT TACAACTTAT GTTCAACAAG 3420

CAGAATACAA AATTCCAATC CAATATACTA AGGTTGCACA AGGTGCTCCA TCTAGCTCTT 3480

ACCTTCCGTT AAAGGTAAAT CCTGCTGGAG TTATCCCTGT TATCTTTGCC AGTTCGATTA 3540

CTGCAGCGCC TGCGGCTATT CTTCAGTTTT TGAGTGCCAC AGGTCATGAT 7GGGCTTGGG 3600

TAAGGGTAGC ACAAGAGATG TTGGCAACTA CTTCTCCAAC TGGTATTGCC ATGTATGCTT 3660 
TGTTGATTAT TCTCTTTACA TTCTTCTATA CGTTTGTACA GATTAATCCT GAAAAAGCAG 3720

CAGAGAGCCT ACAAAAGAGT GGTGCCTATA TCCATGGAGT TCGTCCTGGT AAAGGTACAG 3780

AAGAATATAT GTCTAAACTT CTTCGTCGTC TTGCAACTGT TGGTTCCCTC TTCCTTGGTG 3840

TGATTTCCAT TTTACCGATT GCAGCTAAAG ATGTATTTGG TCTTTCTGAT GTTGTTGCCT 3900

TTGGTGGAAC AAGTCTCTTG ATCATTATCT CTACAGGTAT CGAAGGAATC AAGCAATTGG 3960

AAGGTTACCT ATTGAAACGT AAGTATGTTG GTTTCATGGA CAGAACAGAA TAAAAGTATT 4020

TACTGAATCA GTAAATACTG AGGGAGTGGA GGTTTAAACT CTGACATTTG TAAGAGTTGG 4080

ATCTCCCCTC TTCTATTTTG TTTTTAAATC GGGGTGAAAA AACTTTTTGC TTCTATTTAA 4140

AAACAAAATA AGGAGATCAA ATCATGAATC TTTTGATTAT GGGCTTACCT GGTGCAGGTA 4200

AGGGAACTCA AGCAGCAAAA ATCGTAGAAC AATTCCATGT TGCACATATC TCAACAGGTG 4260

ATATGTTCCG CGCTGCAATG GCAAATCAAA CTGAAATGGG TGTTCTTGCT AAGTCATATA 4320

TTGACAAGGG TGAATTGGTT CCTGACGAAG TTACAAATGG AATCGTAAAA GAACGCCTTT 4380

CACAAGATGA TATTAAAGAA ACAGGATTCT TATTGGATGG TTACCCACGT ACAATTGAAC 4440

AAGCTCATGC CTTGGACAAA ACATTGGCTG AACTTGGCAT TGAACTAGAA GGTATTATCA 4500

ATATTGAAGT GAACCCTGAC AGCCTCTTGG AACGTTTGAG TGGCCGTATC ATCCACCGCG 4560

TAACTGGAGA AACTTTCCAC AAGGTCTTTA ACCCACCAGT TGACTATAAA GAAGAAGATT 4620

ACTACCAACG TGAAGATGAT AAGCCTGAGA CAGTAAAACG TCGTTTGGAT GTTAATATTG 4680

CTCAAGGAGA ACCAATCATT GCTCACTACC GTGCCAAAGG TTTGGTTCAT GACATCGAAG 4740

GTAATCAAGA TATCAATGAT GTCTTCTCAG ATATTGAAAA AGTATTGACA AATTTGAAAT 4800

AAAGCGTTTT TCACACTTGC AAAAATCCGC TACAAATGTT ATACTGAAAT AGTCTGACTT 4860

ATAATTGTTG TCTCTGTGTC TAGAGGCATC GAATCGAAAT TTATGGAGGT GCTTTTGCGT 4920

GGCAAAAGAC GATGTGATTG AAGTTGAAGG CAAAGTAGTT GATACAATGC CGAATGCAAT 4980

GTTTACGGTT GAACTTGAAA ATGGACATCA GATTTTAGCA ACAGTTTCTG GTAAAATTCG 5040

TAAAAACTAT ATTCGTATTT TAGCGGGAGA TCGTGTTACT GTCGAAATGA GTCCATATGA 5100

CTTGACACGT GGACGTATCA CTTACCGCTT TAAATAATCG AAAAACTTGG AGGGATAAGA 5160

AATGAAAGTA AGACCATCGG TCAAACCAAT TTGCGAATAC TGTAAAGTTA TTCGTCGTAA 5220

TGGTCGTGTT ATGGTAATTT GCCCAGCAAA TCCAAAACAC AAACAACGTC AAGGATAAGA 5280

TAGAAAGGAG AAAACATGGC TCGTATTGCT GGAGTTGATA TTCCAAATGA CAAACGCGTA 5340

GTAATCTCAT TGACTTATGT TTATGGTATC GGACTTGCAA CATCTAAGAA AATTTTGGCT 5400

GCTGCTGGAA TCTCAGAAGA TGTTCGTGTA CGTGATCTTA CATCAGATCA AGAAGATGCT 5460
ATCCGTCGTG AAGTGGATGC AATCAAAGTT GAAGGTGACC TTCGTCGTGA AGTAAACTTG 5520

AACATCAAAC GTTTGATGGA AATCGGTTCA TACCGTGGTA TCCGTCACCG TCGTGGACTT 5580

CCTGTCCGTG GACAAAATAC TAAAAACAAC GCTCGCACTC GTAAAGGTAA AGCTGTTGCG 5640

A7TGCTGGTA AGAAAAAATA ATATAGGAGG TAA^AGTCTT GGCTAAACCA ACACGTAAAC 5700

GTCGTGTGAA AAAGAATATC GAATCTGGTA TTGCTCATAT TCACGCTACA TTTAATAACA 5760

CTATTGTTAT GATTACTGAT GTGCATGGTA ATGCAATTGC TTGGTCATCA GCTGGTGCTC 5820

TTGGTTTCAA AGGTTCTCGT AAATCTACAC CATTCGCTGC TCAAATGGCT TCTGAAGCTG 5880

CTGCTAAATC TGCACAAGAA CACGGTCTTA AATCAGTTGA AGTTACTGTA AAAGGTCCAG 5940

GTTCTGGTCG TGAGTCAGCT ATTCGTGCGC TTGCTGCCGC TGGTCTTGAA GTAACAGCAA 6000

TTCGTGATGT GACTCCAGTG CCACACAATG GTGCTCGTCC TCCAAAACGT CGCCGTGTAT 6060

AATCATCGCA TTACACTGCT TTTCGTTTAA GAGGGAGTAA CTAAATGATC GAGTTTGAAA 6120

AACCAAATAT AACAAAAATT GATGAAAATA AAGATTATGG CAAGTTTGTA ATCGAACCAC 6180

TTGAACGTGG CTACGGTACA ACTCTTGGTA ACTCTCTTCG TCGTGTACTT CTAGCTTCTC 6240

TACCAGGAGC AGCTGTGACA TCTATCAACA TTGATGGTGT GTTACATGAG TTTGACACAG 6300

TTCCAGGTGT TCGTGAAGAC GTGATGCAAA TCATTCTGAA CATTAAAGGA ATTGCAGTGA 6360

AATCGTACGT TGAAGACGAA AAAATCATCG AACTGGATGT TGAAGGTCCT GCTGAAGTAA 6420

CAGCTGGTGA CATTTTGACA GATAGCGATA TTGAAATTGT AAATCCAGAT CATTATCTCT 6480

TTACAATCGG TGAAGGTTCT TCTCTAAAAG CGACTATGAC TGTTAACAGT GGTCGTGGAT 6540

ATGTACCTGC TGACGAAAAT AAAAAGGATA ATGCACCAGT TGGAACACTT GCTGTAGATT 6600

CTATTTATAC ACCAGTTACA AAAGTCAACT ATCAAGTGGA ACCTGCTCGT GTAGGTAGCA 6660

ATGATGGTTT CGACTCTAG 6679
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1703 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
AGAATACCTT GGGGCAACTG TTCAAGTCAT TCCTCATATC ACAGATGCTT TGAAAGAAAA 60

AATCAAGAGT GCCGCTCTAA CGACCGACTC 7GATG?CATT ATCACAGAGG TTGGTGGAAC 120

AGTAGGAGAT ATCGAGTCCT TGCCATTCCT AGAGGCTCTT CGCAGATGAA GGCAGATGTG 180

GGGCGGATAA TGTCATGTAT ATCCATACAA CCTTCTTCCT TACCTCAAGG CTGCTGGTGA 240

AATGAAACCA AACCAACCCA ACACTCTGTC AAAGATTGCG TGGCTTGGGA ATCCAACCAA 300

ATATGTTGGT TATTCGTACA GAAGAGCCAG CTGGTCAAGG AATTAAAAAT AAACTGGCCC 360

AGTTCTGTGA TGTGGCACCA GAATCCCTAA TCGAATCGTT GGATGTTGAA CACCTTTACC 420

AAATTCCACT GAACTTGCAG GCACAAGGGA TGGACCAAAT TGTTTGTGAT CATTTGAAAT 480

TAGACGCACC AGCAGCGGAT ATGACAGAAT GGTCAGCCAT GGTGGACAAG GTCATGAACC 540

TCAAGAAACA AGTTAAGATT TCCCTTGTTG GTAAGTATGT GGAGTTGCAA GATGCCTATA 600

TCTCAGTGGT CGAAGCCTTG AAACACTCTG GCTATGTCAA TGATGTAGAA GTTAAAATCA 660

ATTGGGTCAA TGCCAATGAT GTGACAGCAG AGAATGTAGC AGAACTCTTG TCTGATGCGG 720

ACGGGATCAT CGTACCAGGT GGTTTTGGTC AACGTGGTAC AGAAGGGAAA ATCCAAGCCA 780

TCCGCTATGC GCGTGAAAAT GATGTTCCAA TGTTGGGAGT CTGCTTGGGA ATGCAGTTGA 840

CATGTATCGA GTTTGCTCGT CACGTTTTAG GTCTTGAAGG TGCCAATTCT GCAGAGCTTG 900

CACCAGAAAC AAAATACCCT ATCATTGATA TCATGCGTGA TCAGATTGAT ATTGAGGATA 960

TGGGTGGAAC CCTTCGTTTG GGACTTTATC CGTCTAAGTT GAAACGTGGC TCTAAGGCTG 1020

CTGCTGCTTA TCACAATCAA GAAGTGGTGC AACGCCGTCA CCGTCACCGT TATGAGTTTA 1080

AATAATGCCT TCCGTGAGCA GTTTGAGGCA GCAGGTTTGT CTTTTCAGGA GTTTCTCCAG 1140

ACAATCGTTT GGTAGAAATC GTGGAAATCC TGAAAATAAA TTCTTTGTAG CTTGTCAGTA 1200

TCACCCTGAA CTGTCAGCCG TCCAACCGAC CAGAAGAACT CTACACTGCC TTTGTTACTG 1260

CAGCGGTTGA GAACAGCAAT TAGCAAAATC AGAACCTTTG AGAAAAATCT CAGAGGTTTT 1320

TTGCATACGA TGATATTGCA GTATATCTGA GGTAGGAGTC CTCTGTATGT ACCTGCTACC 1380

GTTGAAATCA ATAGCGACTC CCTCTTGCCC TGTGCTAGTG AATGGATTTA TCAGTATATT 1440

GAAATGAAAT AAAATTTGAA CAAATTAATT CGGAAAGCCA AATCAATTTC TAGCAAAGTT 1500

TTAGGAACTG GATTGTATAG TGAATTGAAA TAAGATGTGA ACATCTCTAT CAGGAAAGTC 1560

AAATTAATTT ATAGAAATAT TTTAGCAGTC AAGATGGACT GTTATAGATT CAATATACTA 1620

TACTTTTTTA ATTTAATCCA CTATAATAAA ATGAAATAAT AACAGGACAA ATCGTTCAGG 1680

ACAGTCAAAT CGACTCTAGA GGA 1703
(2) INFORMATION FOR SEQ ID NO: 35: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1620 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

ATTGTAAAAC ACCAAGGAAA AACAGCTAAA GAAGCGAAAG AATTGGCCAT TGACTACATG 60

AATAAGGTTG GCATTCCAGA CGCAGATAGA CGTTTTAATG AATACCCATT CCAATATTCT 120

GGAGGAATGC GTCAACGTAT CGTTATTGCG ATTGCCCTTG CCTGCCGACC TGATGTCTTG 180

ATCTGTGATG AGCCAACAAC TGCCTTGGAT GTAACTATTC AAGCTCAGAT TATTGATTTG 240

CTAAAATCTT TACAAAACGA GTATCATTTC ACAACAATCT TTATTACCCA CGACCTTGGT 300

GTGGTGGCAA GTATTGCGGA TAAGGTAGCG GTTATGTATG CAGGAGAAAT CGTTGAGTAT 360

GGAACGGTTG AGGAAGTCTT CTATGACCCT CGCCATCCAT ATACATGGAG TCTCTTGTCT 420

AGCTTGCCTC AGCTTGCTGA TGATAAAGGG GATCTTTACT CAATCCCAGG AACACCTCCG 480

TCACTTTATA CTGACCTGAA AGGGGATGCT TTTGCCTTGC GTTCTGACTA CGCAATGCAG 540

ATTGACTTCG AACAAAAAGC TCCTCAATTC TCAGTATCAG AGACACATTG GGCTAAAACT 600

TGGCTTCTTC ATGAGGATGC TCCAAAAGTA GAAAAACCAG CTGTGATTGC AAATCTCCAT 660

GATAAGATCC GTGAAAAAAT GGGATTTGCC CATCTGGCTG ACTAGGAGGA AGGAAATGTC 720

TGAAAAATTA GTAGAAATCA AAGATTTAGA AATTTCCTTC GGTGAAGGAA GTAAGAAGTT 780

TGTCGCGGTT AAAAATGCTA ACTTCTTTAT CAACAAGGGA GAAACTTTCT CGCTTGTAGG 840

TGAGTCCGGT AGTGGGAAAA CAACTATTGG TCGTGCTATC ATCGGTCTAA ATGATACAAG 900

TAATGGAGAT ATCATTTTTG ATGGTCAAAA GATTAATGGT AAGAAATCGC GTGAACAAGC 960

TGCGGAATTG ATTCGTCGAA TCCAGATGAT TTTCCAAGAC CCTGCCGCAA GTTTGAATGA 1020

ACGTGCGACT GTTGATTATA TTATTTCTGA AGGTCTTTAC AATCACCGTT TATTTAAGGA 1080

TGAAGAAGAA CGTAAAGAGA AAGTTCAAAG TATTATCCGT GAAGTAGGTC TTCTTGCTGA 1140

GCACTTGACT CGTTACCCTC ATGAATTCTC AGGCGGTCAA CGTCAACGTA TCGGTATTGC 1200

CCGTGCCTTG GTCATGCAAC CAGACTTTGT TATTGCAGAT GAGCCAATTT CAGCCTTGGA 1260

CGTTTCTGTA CGTGCCCAAG TCTTGAACTT GCTCAAAAAA TTCCAAAAAG AGCTCGGCTT 1320

GACCTATCTC TTCATCGCCC ATGACTTGTC GGTTGTTCGC TTTATTTCAG ATCGTATCGC 1380

AGTTATTTAC AAGGGTGTTA TTGTAGAGGT TGCAGAAACA GAAGAATTGT TTAACAATCC 1440

AATTCACCCA TATACTCAAG CCTTGCTTTC AGCGGTACCA ATCCCAGATC CAATCTTGGA 1500

ACGTAAGAAG GTCTTGAAGG TTTACGACCC AAGTCAACAC GACTATGAGA CTGATAAGCC 1560

ATCTATGGTA GAAATCCGTC CAGGTCACTA TGTTTGGGCG AACCAAGCCG AATTAGCACG 1620

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 984 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

GTACCCGGGG ATCAGGTTTT ACGGATTCTT GAAGTTCTCT GTGGGCAGGA CCTCTTGCAG 60 

GTAAGAGTAA GAGTGATTCT ACAAGATTTA CTAGAAGCTA GAAAAATGTG GCAAGCTAAT 120

GTCAGCTTTC AAAATGCCAT GGAATATCTG GTCTTGAAAG AAATATAAAC TCAAAAATGA 180

ATGATAAAGA AAGGAAAGGG CTGTTTTATG GACAAAAAAG AATTATTTGA CGCGCTGGAT 240

GATTTTTCCC AACAGTTATT GGTAACCTTG GCCGATGTGG AAGCCATCAA GAAAAATCTC 300

AAGAGCCTGG TAGAGGAAAA TACAGCTCTT CGTTTGGAAA ATTCTAAGTT GCGAGAACGC 360

TTGGGTGAGG TGGAAGCAGA TGCTCCTGTC AAGGCCAAGC ATGTTCGTGA AAGTGTCCGT 420

CGCATTTACC GTGATGGATT TCACGTATGT AATGATTTTT ATGGACAACG TCGAGAGCAG 480

GACGAGGAAT GTATGTTTTG TGACGAGTTG CTATACAGGG AGTAGGCATG CAGATTCAAA 540

AAAGTTTTAA GGGGCAGTCT CCCTATGGCA AGCTGTATCT AGTGGCAACG CCGATTGGCA 600

ATCTAGATGA TATGACCTTT CGAGCTATCC AGACCTTGAA AGAAGTAGAT TGGATTGCTG 660

CTGAGGATAC GCGCAATACA GGTCTTTTGC TCAAGCATTT TGACATTTCC ACCAAGCAGA 720

TCAGTTTTCA TGAGCACAAT GCCAAGGAAA AAATTCCTGA TTTGATTGGT TTCTTGAAAG 780

CAGGGCAAAG TATTGCTCAG GTCTCTGATG CCGGTTTGCC TAGCATTTCA GACCCTGGTC 840

ATGATTTAGT TAAGGCAGCT ATTGAGGAAG AAATTGCAGT TGTGACAGTT CCAGGTGCCT 900

CTGCAGGAAT TTCTGCCTTG ATTGCCAGTG GTTTAGCGCC ACAGCCACAT ATCTTTTACG 960

GTTTTTTACC GAGAAAATCA GGTC 984

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1554 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

CTAGAGTCGA AAAGACAAGC AGGAGCGTAT TTCCAAAGAA ACCATGGAAA TCTATGCCCC 60

GCTTGCCCAT CGTTTGGGGA TTTCCAGTGT CAAATGGGAA TTAGAAGACT TGTCTTTCCG 120

TTATCTCAAT CCAACGGAGT TTTACAAGAT TACCCATATG ATGAAGGAAA AGCGCAGGGA 180

GCGTGAGGCC TTGGTGGATG AGGTAGTCAC AAAATTAGAG GAGTATACGA CAGAACGTCA 240

CTTGAAAGGG AAGATTTATG GTCGTCCCAA GCATATTTAC TCAATTTTCC GCAAAATGCA 300

GGACAAGAGA AAACGGTTTG AGGAAATCTA TGATCTGATT GCTATTCGTT GTATTTTAGA 360

TACCCAAAGT GATGTTTATG CCATGCTTGG TTACGTGCAT GAATTTTGGA AACCGATGCC 420

AGGTCGCTTC AAAGACTATA TCGCCAACCG CAAGGCCAAT GGTTATCAGT CTATCCATAC 480

GACTGTTTAT GGACCAAAAG GGCCGATTGA ATTCCAGATT CGAACCAAGG AAATGCACGA 540

GGTGGCTGAG TACGGGGTTG CGGCTCACTG GGCTTATAAG AAAGGTATAA AGGGGCAAGT 600

TAACAGCAAG GAATCAGCTA TTGGAATGAA CTGGATCAAG GAGATGATGG AGCTCCAAGA 660

CCAGGCTGAT GATGCTAAGG AATTTGTGGA CTCTGTTAAG GAAAACTATT TGGCTGAGGA 720

GATTACCGTT TTACCCCAGA TGGAGCTGTC CGTTCCTTCC CAAAGATTCA GGACCGATTG 780

ATTTTGCCTA CGAAATCCAT ACCAAGGTCG GTGAAAAGCA ACTGGTGCCA AGGTCAATGG 840

CCGCATGGTT CCACTGACAC CCAAGTTAAA GGACAGGGGA TCAGGTTGAA ATTATCGCCA 900

ACCCGAACTC CTTTGGACCT TAGCCGTGAC TGGCTCAATA TGGTCAAGAC TAGCAAGGCG 960

CGCAATAAGA TTCGCCAGTT CTTTAAAAAC CAAGATAAGG AATTGTCTGT CAACAAGGGT 1020

CGTGAGATGC TGATGGCTCA GTTCCAAGAA AATGGCTATG TGGCAAATAA ATTTATGGAC 1080

AAGCGCCACA TGGATCAAGT TCTGCAAAAG ACCAGTTACA AGACAGAAGA CTCCCTCTTT 1140

GCGGCCATTG GTTTTGGGGA AATCGGTGCG ATTACCGTCT TTAACCGTCT GACTGAAAAG 1200

GAGCGCCGTG AGGAAGAGCG TGCCAAGGCC AAGGCTGAGG CAGAGGAGCT TGTCAAAGGT 1260

GGCGAGGTCA AGGTTGAAAA TAAAGA E AiCT CTCAAGGTCA AGCATGAGGG GGGAGTGGTT 1320

ATTGAAGGTG CTTCTGGTCT CCTAGTGCGG ATTGCTAAGT GTTGTAACCC CGTGCCTGGT 1380

GACGATATTG TTGGCTACAT TACCAAGGGT CGTGGTGTGG CTATTCACCG TGTGGACTGT 1440

ATGAACCTGC GTGCCCAAGA AAACTACGAG CAACGTCTCC TTGATGTGGA ATGGGAAGAC 1500

CAGTACTCTA GCTCAAATAA GGAGTATATG GCCCATATCG ACTCTAGAGG ATCC 1554
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3190 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

CGTCAGTGCT AAAACAGGGG AAATTCTGGC AACAACGCAA CGACCGACCT TTGATGCAGA 60 

TACAAAAGAA GGCATTACAG AGGACTTTTT TGGCGTGATA TCCTTTACCA AAGTAACTAT 120

GAGCCAGGTT CCACTATGAA AGTGATGATG TTGGCTGCTG CTATTGATAA TAATACCTTT 180

CCAGGAGGAG AAGTCTTTAA TAGTAGTGAG TTAAAAATTG CAGATGCCAC GATTCGAGAT 240

TGGGACGTTA ATGAAGGATT GACTGGTGGC AGAATGATGT CTTTTTCTCA AGGTTTTGCA 300

CACTCAAGTA ACGTTGGGAT GACCCTCCTT GAGCAAAAGA TGGGAGATGC TACCTGGCTT 360

GATTATCTTA ATCGTTTTAA ATTTGGTGTT CCGACCCGTT TCGGTTTGAC GGATGAGTAT 420

GCTGGTCAGC TTCCTGCGGA TAATATTGTC AACATTGCGC AAAGCTCATT TGGACAAGGG 480

ATTTCAGTGA CCCAGACGCA AATGATTCGT GCCTTTACAG CTATTGCTAA TGACGGTGTC 540

ATGCTGGAGC CTAAATTTAT TAGTGCCATT TATGATCCAA ATGATCAAAC TGCTCGGAAA 600

TCTCAAAAAG AAATTGTGGG AAATCCTGTT TCTAAAGATG CAGCTAGTCT AACTCGGACT 660

AACATGGTTT TGGTAGGGAC GGATCCGGTT TATGGAACCA TGTATAACCA CAGCACAGGC 720

AAGCCAACTG TAACTGTTCC TGGGCAAAAT GTAGCCCTCA AGTCTGGTAC GGCTCAGATT 780

GCTGACGAGA AAAATGGTGG TTATCTAGTC GGGTTAACCG ACTATATTTT CTCGGCTGTT 840

CGATGAGTCC GGCTGAAAAT CCTGGATTTT ATCTTGTATG TGACGGTCCA ACAACCTGGA 900
ACATTATTCA GGTATTCAGT TGGGAGAATT TGCCAATCCT ATCTTGGAGC GGGCTTCAGC 960

TATGAAAGAC TCTCTCAATC TTCAAACAAC AGCTAAGGCT TTGGAGCAAG TAAGTCAACA 1020

AAGTCCTTAT CCTATGCCCA GTGTCAAGGA TATTTCACCT GGTGATTTAG CAGAAGAATT 1080

GCGTCGCAAT CTTGTACAAC CCATCGTTGT GGGAACAGGA ACGAAGATTA AAAACAGTTC 1140

TGCTGAAGAA GGGAAGAATC TTGCCCCGAA CCAGCAAGTC CTTATCTTAT CTGATAAAGC 1200

AGAGGAGGTT CCAGATATGT ATGGTTGGAC AAAGGAGACT GCTGAGACCC TTGCTAAGTG 1260

GCTCAATATA GAACTTGAAT TTCAAGGCTC GGGCTCTACT GTGCAGAAGC AAGATGTTCG 1320

TGCTAACACA GCTATCAAGG ACATTAAAAA AATTACATTA ACTTTAGGAG ACTAATATGT 1380

TTATTTCCAT CAGTGCTGGA ATTGTGACAT TTTTACTAAC TTTAGTAGGA ATTCCGGCCT 1440

TTATCCAATT TTATAGAAAG GCGCAAATTA CAGGCCAGCA GATGCATGAG GATGTCAAAC 1500

AGCATCAGGC AAAAGCTGGG ATTCCTACAA TGGGAGGTTT GGTTTTCTTG ATTACTTCTG 1560

TTTTGGTTGC TTTCTTTTTC GCCCTATTTA GTAGCCAATT CAGCAATAAT GTGGGAATGA 1620

TTTTGTTCAT CTTGGTCTTG TATGGCTTGG TCGGATTTTT AGATGACTTT CTCAAGGTCT 1680

TTCGTAAAAT CAATGAGGGG CTTAATCCTA AGCAAAAATT AGCTCTTCAG CTTCTAGGTG 1740

GAGTTATCTT CTATCTTTTC TATGAGCGCG GTGGCGATAT CCTGTCTGTC TTTGGTTATC 1800

CAGTTCATTT GGGATTTTTC TATATTTTCT TCGCTCTTTT CTGGCTAGTC GGTTTTTCAA 1860

ACGCAGTAAA CTTGACAGAC GGTGTTGTAC GGTTTAGCTA GTATTTCCGT TGTGATTAGT 1920

TTGTTTGCCT ATGGAGTTAT TGCCTATGTG CAAGGTCAGA TGGATATTCT TCTAGTGATT 1980

CTTGCCATGA TTGGTGGTTT GCTCGGTTTC TTCATCTTTA ACCATAAGCC TGCCAAGGTC 2040

TTTATGGGTG ATGTGGGAAG TTTGGCCCTA GGTGGGATGC TGGCAGCTAT CTCTATGGCT 2100

CTCCACCAGG AATGGACTCT CTTGATTATC GGAATTGTGT ATGTTTTTGA AACAACTTCT 2160

GTTATGATGC AAGTCAGTTA TTTCAAACTG ACAGGTGGTA AACGTATTTT CCGTATGACG 2220

CCTGTACATC ACCATTTTGA GCTTGGGGGA TTGTCTGGTA AAGGAAATCC TTGGAGCGAG 2280

TGGAAGGTTG ACTTCTTCTT TTGGGGAGTT GGGCTTCTAG CAAGTCTCCT GACCCTCGCA 2340

ATTTTGTATT TGATGTAAGA ATGGCACCCT GATGTTTCAG GGTGTTTTTG TGTTTAAATA 2400

CACAATGAAA ATCAAAGAAC AAACTAGAAA GCTAACTTTA GGCTGCTCAA AATATAATAT 2460

ATTGAAACTA GAATAGTACA CCTCTACTTC TAAAACATTG TTAGAAATCG ATTTGACTGT 2520

CCTGAACGAT TTATCCTGTT CTTATTTCAT TTTACTATAC AGTTTCGAGG TTGTAGATAA 2580

GGCGAAGCTG ATGTGGTTTG AAGAGATTTT CTGAAAAGTG TTAACACCTA CAGACAAGCC 2640

TGACGATAGC AAGAACTACC CTACTCGATA GGTATCGGCT TTTGCTTTCT GAAAAAAATT 2700

ATTTTAAGCA TTTGACAAAT CTAGCAACAA AAAATTCTAT AAATATAATA GATTGAAACT 2760
AGAATAGTAC ACATCTACTT CTAAAACATT GTTAGAAATC GATTTGACTG TCCTGATCGA 2820

TTTGTCCTGT TCTTGTTTCA TTTTACTATA TTTCTATGAT AAAACGCATA GTATCAAGTT 2880

TTCTTAATCC CCTGATACTA TGCGTGTTTG TAATTTTTAA GATTTTGTGC TTAGAGTCGA 2940

CTCCTTATTT TAGATATTTA AAAGGAATCT CACTTCCACA GAGCCAGTTG TAGACTTGGT 3000

CATTAACAAA TACATTCATG GCTTCGTGAG CATACTCAGG CATGATACGA TAGGTTTTAT 3060

CGCAGGTCAG ACGATTATAA ATCGCAAACT GGGTAATGGG ATAGCAAACA TCGTCGTCCA 3120

AGCCCGTAAT CATCTTAACC TCACCTTGGA TACGATGGGC AAGATTTTTG ACATCGACTC 3180

TAGAGGATCC 3190 
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5992 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: 

TTGTTCTTAG TGTTCCGACA AAGATTCTTC AAAATCAAAT CATGGAAGAA GAAGGTAAAC 60

GTCTCAAGGA AGTGTTCCAT ACAGATATTC ATAGCTTAAA GGGACCACAA AATTATCTGA 120

AGTTGGATGC CTTTTATCAT TCTTGCAGGA AAATGATGAA AATCGCTTAT TTAGACGCTT 180

TAAAATGCAA GTCTTGGTCT GGCTTACTGA GACAGAGACA GGAGATTTGG ATGAAATCGG 240

GCAACTCTAC CGTTACCAAC ATTTTCTAGC AGACCTTCGT CATAATGGGA ATTTATCATC 300

CCAGAGCTTA TTTGTGACGG AAGATTTTTG GAAACGTAGT CAAGAAAGGG CAGAGACTTG 360

CAAGCTTTTA GTGACTAATC ATGCCTATCT CGTAACCAGA CTTGAAGATA ATCCTGAATT 420

TGTCAGTGAC CGTTTACTGA TTATTGATGA AGTCCAAAAG ATTTTGTTAG CTCTAGAAAA 480

TCTGCTTCAA GAGACCTACG ATATACAATC TATTATCGAT TTAATTGATA AGGCTTTAGT 540

AGGAGAAGAA AACAGGGTTC AACAACGGAT ACTAGAAAGT ATTCGCTTTG AGTGCCTCTA 600

CTTGATAGAA CAATTTCAGT CTGGCAAATC TAGGAAAAAT ATCTTAGATT CTCTGGACAA 660

TCTCCATCAG TATTTTTCAG AATTAGAAGT GGAAGGCTTT GATGAGCTGG TTCGCTATTT 720

TACAGCTGAA GGTGATTACT GGCTTGAAGT AACTGAAACG AGTCAAAAGA AAATTCAGAT 780
TTCTTCTACA AAATCAGGCC GTACTCTTCT GTCCTCTTTA CTTCCTGAGA GTTGCCAAGT 840

CTTGGGAGTA TCGGCTACTC TTGAGATTAG TCAGAGGGTT TCTTTGGCAG ACCTTTTAGG 900

CTATCCTGAA GCCAAATTTG TCAAGATTGA ATCTCGGGGA AAACAGGAAC AAGAAGTGGT 960

TATGGTCAAA GATTTCCCTC TGGTAACAGA AACCTCCTTA GAAGTCTATG CCAGAGAGGT 1020

AGCTGCTTTA CTAGTGGAAA TTCAAGCTTT CCAGCAACCG ATTTTGGTTC TCTTTACCGC 1080

TAAAGACATG CTTCTAGCAG TATCGGATTT ACTTACAGTT AGCCACTTGG CCCAGTATAA 1140

AAATGGGGAT GTTCATCAGC TAAAGAAACG CTTTGAAAAA GGTGAACAAC AAATCTTGCT 1200

TGGTGCAGCA AGTTTCTGGG AGGGAGTTGA TTTTTCAAGC CATCCTTTTG TGATTCAAGT 1260

TGTACCGAGG CTTCCTTTCC AAAATCCTCA AGAACCCTTG ACGAAAAAGA TTAATCAAGA 1320

ACTGAATCAA GAAGGGAAAA ATGCCTTTTA TGATTATCAA TTGCCAATGG CCATTATTCG 1380

TTTAAAACAG GCTTTGGGAA GAAGTATGAG ACGTGAATAC CAACGTTCCT TAACTCTTAT 1440

TTTGGATAGG AGAATCATCG GAAAACGATA CGGCAAACAA ATAGTAGCAT CTCTAGCAGA 1500

AGAAGCGACT GTTAAAACCA TCTCTCGATC CGAAGTTGAC GAGGCTATTG ATAGATTTTT 1560

TAATGAACTT TGATAAATAG TATTGTATGA AAGTATAAGG TTAGTACATA TGAAACGTTC 1620

TCTCGACTCT AGAGTCGATT ATAGTTTGCT CTTGCCAGTA TTTTTTCTAC TGGTCATCGG 1680

TGTGGTGGCT ATCTATATAG CCGTTAGTCA TGATTATCCC AATAATATTC TGCCCATTTT 1740

AGGGCAGCAG GTCGCCTGGA TTGCCTTGGG GCTTGTGATT GGTTTTGTGG TCATGCTCTT 1800

TAATACAGAA TTTCTTTGGA AGGTGACCCC CTTTCTATAT ATTTTAGGCT TGGGACTTAT 1860

GATCTTGCCG ATTGTATTTT ATAATCCAAG CTTAGTTGCA TCAACGGGTG CCAAAAACTG 1920

GGTATCAATA AATGGAATTA CCCTATTTCA ACCGTCAGAA TTTATGAAGA TATCCTATAT 1980

CCTCATGTTG GCTCGTGTCA TTGTCCAATT TACAAAGAAA CATAAGGAAT GGAGACGCAC 2040

GGTTCCGCTG GACTTTTTGT TAATTTTCTG GATGATTCTC TTTACCATTC CAGTCCTAGT 2100

TCTTTTAGCA CTTCAAAGTG ACTTGGGGAC GGCTTTGGTT TTTGTAGCCA TTTTCTCAGG 2160

AATCGTTTTA TTATCAGGGG TTTCTTGGAA AATTATTATC CCAGTATTTG TGACTGCTGT 2220

AACAGGAGTT GCTGGTTTCT TAGCTATCTT TATTAGCAAG GACGGACGAG CTTTTCTTCA 2280

CCAGATTGGA ATGCCGACCT ACCAAATCAA TCGGATTTTG GCTTGGCTCA ATCCCTTTGA 2340

GTTTGCCCAA ACAACGACTT ACCAGCAGGC TCAAGGGCAG ATTGCCATTG GGAGTGGTGG 2400

CTTATTTGGT CAGGGATTTA ATGCTTCGAA TCTGCTTATC CCAGTTCGAG AGTCAGATAT 2460

GATTTTTACG GTTATTGCAG AAGATTTTGG CTTTATTGGC TCTGTCCTGG TTATTGCCCT 2520

CTATCTCATG TTGATTTACC GTATGTTGAA GATTACTCTT AAATCAAATA ACCAGTTCTA 2580
CACTTATATT TCCACAGGTT TGATTATGAT GTTGCTCTTC CACATCTTTG AGAATATCGG 2640

TGCTGTGACT GGACTACTTC CTTTGACGGG GATTCCCTTG CCTTTCATTT CGCAAGGGGG 2700

ATCAGCGATT ATCAGTAATC TGATTGGTGT TGGTTTGCTT TTATCGATGA GTTACCAGAC 2760

TAATCTAGCT GAAGAAAAGA GCGGAAAAGT CCCATTCAAA CGGAAAAAGG TTGTATTAAA 2820

ACAAATTAAA TAAGGAGAAA ATCATGGTAA AAGTAGCAGT TATGTTAGCT CAGGGCTTTG 2880

AAGAAATTGA AGCCTTGACA GTTGTAGATG TCTTGCGTCG AGCCAATATC ACATGTGATA 2940

TGGTTGGTTT TGAAGAGCAA GTAACGGGTT CGCATGCAAT CCAAGTAAGA GCAGATCATG 3000

TCTTTGATGG AGATTTATCA GACTATGATA TGATTGTTCT TCCTGGAGGT ATGCCTGGTT 3060

CTGCACATTT ACGTGATAAT CAGACCTTGA TTCAAGAATT GCAAAGCTTC GAGCAAGAAG 3120

GGAAGAAACT AGCAGCCATT TGTGCGGCAC CAATTGCCCT CAATCAAGCA GAGATATTGA 3180

AAAATAAGCG ATACACTTGT TATGACGGCG TTCAAGAGCA AATCCTTGAT GGTCACTACG 3240

TCAAGGAAAC AGTAGTGGTA GATGGTCAGT TGACAACCAG TCGGGGTCCT TCAACAGCCC 3300

TTGCCTTTGC CTACGAGTTG GTGGAGCAAC TAGGAGGGGA CGCAGAGAGT TTACGAACAG 3360

GAATGCTCTA TCGAGATGTC TTTGGGTAAA AATCAGTAAA ACGGGAGTTA TTCTCTCGTT 3420

TTTTATGTGG AAAACTCAGG GAAATCATCG CTTTTTTCAT AAAAAAATGC TATAATGAAG 3480

GGTATGAAAT ATCACGATTA CATCTGGGAT TTAGGTGGAA CTTTACTGGA TAATTATGAA 3540

ACTTCAACAG CTGCATTTGT TGAAACATTG GCACTGTATG GTATCACACA AGACCATGAC 3600

AGTGTCTATC AAGCTTTAAA GGTTTCTACT CCTTTTGCGA TTGAGACATT CGCTCCCAAT 3660

TTAGAGAATT TTTTAGAAAA GTACAAGGAA AATGAAGCCA GAGAGCTTGA ACACCCGATT 3720

TTATTTGAAG GAGTTTCTGA CCTATTGGAA GACATTTTAA ATCAAGGTGG CCGTCATTTT 3780

TTGGTCTCTC ATCGAAATGA TCAGGTTTTG GAAATTTTAG AAAAAACCTC TATAGCAGCT 3840

TATTTTACAG AAGTGGTGAC TTCTAGCTCA GGCTTTAAGA GAAAGCCAAA TCCCGAATCC 3900

ATGCTTTATT TAAGAGAAAA GTATCAGATT AGCTCTGGTC TTGTCATTGG TGATCGGCCG 3960

ATTGATATCG AAGCAGGTCA AGCTGCAGGA CTTGATACCC ACTTGTTTAC CAGTATCGTG 4020

AATTTAAGAC AAGTATTAGA CATATAAGAA AAAGGAATAA GATGACAGAA GAAATCAAAA 4080

ATCTGCAGGC ACAGGATTAT GATGCCAGTC AAATTCAAGT TTTAGAGGGC TTAGAGGCTG 4140

TTCGTATGCG TCCAGGGATG TACATTGGAT CAACCTCAAA AGAAGGTCTT CACCATCTAG 4200

TCTGGGAAAT TGTTGATAAC TCAATTGACG AGGCCTTGGC AGGATTTGCC AGCCATATTC 4260

AAGTTTTTAT TGAGCCAGAT GATTCGATTA CTGTTGTGGA TGATGGGCGT GGTATCCCAG 4320

TCGATATTCA GGAAAAAACA GGTCGTCCTG CTGTTGAGAC CGTCTTTACA GTCCTTCACG 4380

CTGGAGGAAA GTTCGGCGGT GGTGGATACA AGGTTTCAGG TGGTCTTCAC GGGGTGGGGT 4440
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CGTCAGTTGT TAATGCCCTT TCCACT CAAT TAGACGTTCA TGTCCATAAA AACGGTAAGA 45 00 

TTCATTACCA AGAATACCGT CGTGGTCATG TTGTCGCAGA TCTTGAAATA GTTGGAGATA 4560

CGGATAAAAC AGGAACAACT GTTCACTTCA CACCGGACCC AAAAATCTTC ACTGAAACAA 4620

CAATCTTTGA TTTTGATAAA TTAAATAAAC GGATTCAAGA GTTGGCCTTT CTAAATCGCG 4680

GTCTTCAAAT TTCTATCACT GATAAGCGCC AAGGTTTGGA ACAAACCAAG CATTATCATT 4740

ATGAAGGTGG GATTGCTAGT TACGTTGAAT ATATCAACGA GAACAAGGAT GTAATCTTTG 4800

ATACACCAAT CTATACAGAC GGTGAGATGG ATGATATCAC AGTTGAGGTA GCCATGCAGT 4860

ACACAACGGG TTACCATGAA AAATGTCATG AGTTTCGCCA ATAATATTCA TACACATGAA 4920

GGTGGAACGC ATGAACAAGG TTTCCGTACA GCCTTGACAC GTGTTATCAA CGATTATGCT 4980

CGTAAGAATA AGTTACTGAA AGACAATGAA GACAATCTAA CAGGGGAAGA TGTTCGCGAA 5040

GGCTTAACTG CAGTTATCTC AGTTAAACAC CCAAATCCAC AGTTTGAAGG ACAAACGAAG 5100

ACCAAATTGG GAAATAGCGA AGTGGTCAAG ATTACCAATC GCCTCTTCAG TGAAGCCTTC 5160

TCCGATTTCC TCATGGAAAA TCCACAGATT GCCAAACGTA TCGTAGAAAA AGGAATTTTG 5220

GCTGCCAAGG CTCGTGTGGC TGCCAAGCGT GCGCGTGAAG TCACACGTAA AAAATCTGGT 5280

TTGGAAATTT CCAACCTTCC AGGGAAACTA GCAGACTGTT CTTCTAATAA CCCTGCTGAA 5340

ACAGAACTCT TCATCGTCGA AGGAGACTCA GCTGGTGGAT CAGCCAAATC TGGTCGTAAC 5400

CGTGAGTTTC AGGCTATCCT TCCAATTCGC GGTAAGATTT TGAACGTTGA AAAAGCAAGT 5460

ATGGATAAGA TTCTAGCTAA CGAAGAAATT CGTAGTCTTT TCACAGCCAT GGGAACAGGA 5520

TTTGGCGCAG AATTTGATGT TTCGAAAGCC CGTTACCAAA AACTCGTTTT GATGACCGAT 5580

GCCGATGTCG ATGGAGCCCA CATTCGTACC CTTCTTTTAA CCTTGATTTA TCGTTATATG 5640

AAACCAATCC TAGAAGCTGG CTATGTTTAT ATTGCCCAAC CACCAATCTA TGGTGTCAAG 5700

GTTGGAAGCG AGATTAAAGA ATATATCCAG CCGGGTGCAG ATCAAGAAAT CAAACTCCAA 5760

GAAGCTTTAG CCCGTTATAG TGAAGGTCGT ACCAAACCGA CTATTCAGCG TTATAAGGGG 5820

CTAGGTGAAA TGGACGATCA TCAGCTGTGG GAAACAACCA TGGATCCCGA ACATCGCTTG 5880

ATGGCTAGAG TTTCTGTAGA TGATGCTGCA GAAGCAGATA AAATCTTTGA TATGTTGATG 5940

GGGGATCGAG TAGAGCCTCG TCGTGAGTTT ATCGACTCTA GAGGATCCCC GG 5992
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 907 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

TACAAAAGTA GGTGGAGAGG CTGATTATTT GGTCTTTCCA CGAAATCGTT TTGAGTTGGC 60 

TCGCGTTGTG AAATTTGCCA ACCAAGAAAA TATCCCTTGG ATGGTTCTTG GCAATGCAAG 120 

CAATATCATC GTTCGTGATG GTGGGATTCG TGGATTTGTC ATCTTGTGTG ACAAGCTCAA 180

TAACGTTTCT GTTGATGGCT ATACCATTGA AGCAGAAGCT GGGGCTAACT TGATTGAAAC 240

AACTCGCATT GCCCTCCGTC ATAGTTTAAC TGGCTTTGAG TTTGCTTGTG GTATTCCAGG 300 

AAGCGTTGGC GGTGCTGTCT TTATGAATGC GGGTGCCTAT GGTGGCGAGA TTGCTCACAT 360 

CTTGCAGTCT TGTAAGGTCT TGACCAAGGA TGGAGAAATC GAAACCCTGT CTGCTAAAGA 420

CTTGGCTTTT GGTTACCGCC ATTCAGCTAT TCAGGAGTCT GGTGCAGTTG TCTTGTCAGT 480

TAAATTTGCC CTAGCTCCAG GAACCCATCA GGTTATCAAG CAGGAAATGG ACCGCTTGAC 540

GCACCTACGT GAACTCAAGC AACCTTTGGA ATACCCATCT TGTGGCTCGG TCTTTAAGCG 600 

TCCAGTCGGG CATTTTGCAG GTCAGTTAAT TTCAGAAGCT GGCTTGAAAG GCTATCGTAT 660 

CGGTGGCGTA GAAGTGTCAG AAAAGCATGC AGGATTTATG ATCAATGTCG CAGATGGAAC 720

GGCCAAAGAC TACGAGGACT TGATCCAATC GGTTATCGAA AAAGTCAAGG AACACTCAGG 780

TATTACGCTT GAAAGAGAAG TCCGGATCTT GGGTGAAAGC CTATCGGTAG CGAAGATGTA 840

TGCAGGTGGT TTTACTCCCT GCAAGAGGTA GTGGGGACCT GACAGAGCCC CGATCGGTTA 900 

AGCTATG 907
(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2764 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:

AGAACCCTTG GATGCAGCCA TTCAGAAGA? TTCTCCAGAA TTGTTTGACC AATATGAAAT 60

CTTTAAATCA CGTGAAATGT TGCTAGAATG GTCACCAAAG AATGTTCATA AAGCAACAGG 120

TTTGGCAAAA CTAATCAGCC ATCTTGGAAT CGACCAAAGT CAAGTGATGG CTTGTGGTGA 180

CGAGGCCAAT GACCTCTCTA TGATTGAATG GGCAGGTCTT GGTGTTGCTA TGCAAAACGC 240

TGTTCCTGAA GTAAAGGCAG CCGCAAATGT AGTGACGCCG ATGACCAACG ATGAGGAAGC 300

TGTCGCCTGG GCTATCGAAG AATATGTGCT AAAGGAGAAC TAAGATATGG GATTGTTTGA 360

CCGTCTATTC GGAAAAAAAG AAGAACCTAA AATCGAAGAA GTTGTAAAAG AAGCTCTGGA 420

AAATCTTGAT TTGTCTGAAG ATGTTGATCC TACCTTCACA GAAGTTGAGG AAGTTTCTCA 480

GGAAGAAGCA GAGGTTGAAA TTGTTGAACA AGCTGTGTTC CAAGAAGAGG AAATCCAAGA 540

CACAGTTGAA GAAAGTCTGG ATTTAGAGCC AGTTGTAGAA GTTTCTCAAA AAGAAGTCGA 600

AGAATTTCCA CACTCAGAAG AAGGGAATAC TGAGTTTCTA GAGACTATAG AAGAAAATAA 660

TTCTGAAGTT CTTGAACCAG AAAGGCCTCA AGCAGAAGAA ACCGTTCAGG AAAAATATGA 720

CCGCAGTCTT AAGAAAACTC GTACAGGTTT CGGTGCCCGC TTGAATGCCT TCTTTGCTAA 780

CTTCCGCTCT GTTGACGAAG AATTTTTCGA GGAACTGGAA GAACTGCTGA TTATGAGTGA 840

TGTTGGTGTC CAAGTCGCTT CTAACTTAAC GGAGGAACTA CGTTACGAAG CCAAGCTTGA 900

AAATGCCAAG AAACCTGATG CACTTCGTCG TGTCATCATT GAGAAATTGG TTGAGCTTTA 960

TGAAAAGGAT GGTAGCTACG ATGAAAGCAT CCACTTCCAA GATAACTTGA CAGTTATGCT 1020

CTTTGTTGGT GTGAATGGTG TTGGGAAAAC AACTTCTATC GGAAAACTAG CCCACCGCTA 1080

CAAACAAGCT GGTAAGAAGG TCATGCTGGT TGCAGCAGAT ACCTTCCGTG CGGGTGCAGT 1140

AGCTCAGCTA GCTGAATGGG GCCGACGAGT AGATGTTCCA GTAGTAACTG GACCTGAAAA 1200

AGCTGATCCA GCCAGCGTGG TCTTTGATGG TATGGAACGT GCCGTGGCTG AAGGTATCGA 1260

TATTCTCATG ATTGATACTG CTGGTCGTCT GCAAAATAAG GATAACCTTA TGGCTGAGTT 1320

GGAAAAGATT GGTCGTATTA TCAAACGTGT TGTGCCAGAA GCACCACATG AAACCTTCTT 1380

GGCACTTGAT GCATCAACAG GTCAAAATGC CCTAGTACAG GCCAAAGAAT TTTCGAAAAT 1440

CACACCTTTA ACGGGAATTG TTTTGACTAA GATTGATGGA ACTGCTCGAG GAGGTGTGGT 1500

TCTAGCCATT CGTGAAGAAC TCAATATTCC TGTAAAATTG ATTGGTTTTG GTGAAAAAAT 1560

CGATGATATT GGAGAGTTTA ACTCAGAAAA CTTTATGAAA GGTCTCTTGG AAGGTTTAAT 1620

CTAATCAGAA GCAAAAATCC TGCAAGGCAT AAACTTGCAG GAAATTTTTT TATTCTAAGC 1680

GACCATCTTG ACGATAGGTG ATATCTGGTT GCCAAGTCCA TTTGGCACCG AATTTTTCAA 1740

GTAGGTCAAA GCTGGCTTGA GGTCCCATGC TTCCAGCTTT ATAGTCATGA AGTGGGGCAC 1800

CATTTTCAGC CCAGAGCTTT TCAATACGGT CAATCAACTT CCATGACGCA CAAACTTCAT 1860

CCCAGTGGCT AAAGTTAGTT GAGTTGTTAT TTAGGACATC ATAAATCAAT TTTTCGTATG 1920

GTTCTGGAGA AGCACCAGTT GCAGTCGCAT CTGTACGGTA ATCAAGTGAG TTAGGAGCCA 198 0 

AGTTAAATTC TTCTCCTACT TGCTTCCCAT TTAGGCTAAG AGAGAAGCCT TCTGTTGGTT 2040 

GAATATAGAT GGTCAAAATA TTTGGAGCAA GTGGTTCTCC AAAGATAGAA TCCATTTGTT 2100 

TAAAGACGAT GTTGACATGA GTTCCTTTTT CAGTCAGTCG TTTACCTGTA CGGAAAAAGA 2160 

AAGGAACACC ACGGAATCGA TCGCTGTCTA CAAAGAAGGC ACCAGATGTA AAGGTTTCAG 2220 

TTGTTGATTC TGGATTCACA TTTGGCTCGC TACGATAAGA GATGTATTTC ATGCCATCAA 2280

TCTTACCAGA GCGGTATTGC CCACGGATAA AGTGTTCTTT GAGTTCTTCA TCAGTTGGAT 2340

GATAGAGGTT TTTAAAGACC TTAATCTTTT CAGCACGAAT CTCGTCTTTT GTGAAGCTTG 2400

CTGGTTTGTC CATGGCGAGG AGCGAAAGAA GTTGTAGAGT GTGGTTTTGG ACCATGTCAC 2460

GGAGGGCACC GGATTGGTCA TAGTAGCCAC CACGTTCTTC TACACCCAAG CTCCGCAAAG 2520

GTAATTTGAA CATTGTCGAA AAATCCTTGT TCCAAACGTT TTCAAAAATC AAGTTTGCAA 2580

AGCGAACTGC AAAGATGCTT TGGATCATTT CCTTACCAAG ATAATGGTCG ATACGGAAAA 2640

TTTGTTCTTC GTCAAATGTT GCTAGGAGTT CGTCATTCAA CTTGTTTGCA GTTGCGTAAT 2700

CTGTACCAAA TGGTTTTTCA ACGATCAAGC GCTCAAAACC TTTGCCATCG ACTCTAGAGG 2760

ATCC 2764 
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3189 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

ACACTGTTTA CGATGATGGT CGTATTGATT ACGTGAAAAA CACTTGGAGA TCTTGTCTGA 60 

TGCGATTGCA GATGGAGCTA ATGTAAAAGG TTACTTCATT TGGTCATTAA TGGATGTCTT 120

CTCATGGTCA AACGGTTATG AGAAACGTTA TGGTCTCTTC TACGTAGATT TTGAAACTCA 180

AGAACGTTAT CCTAAGAAAT CAGCTCACTG GTACAAGAAA GTAGCGGAAA CTCAGATTAT 240

AGACTAGTAG AATTAGTCAT TAGATATAGA ATTTTAGTGA GTCCAAAAGA TGTTCAAAGA 300

TTTTATCCAA TCTATTTATG AAAAAGTTTA TATTATAAAT TTCGAAAAAT GCTCTCAAAT 360

ACCGTGTTTG ACGAGTGAAG AATTGAAAGT CTTGGAAAAT GGTATGTCTC GACTGGTAAA 420

GAATGGATTT GTCATTCAGA TGATGAGCTG GAAGAATTTA AAAATCTAT7 TTTAAATTTT 480

ATCAATCCTG AAGAATGGGA TACTATCTCC TTTGATTCAG ATTTTATGCC GTTTCAACAA 540

TCGTAACCAA TTTCTCAAAA AAGTTAAATC TTATATTTAG TACTCTGTAA AACTCTTATC 600

TAATCACGTT GCTTATACTC AATGAAAATC ATAGAAAAAA GCATAGTATC AGGTGTTGAA 660

ACACCTGATA CTATGCTTTT TATTGTGGGA AGATTTACTT TTTTTCTTCT GAAATTGAGT 720

TGTTACCCAG GCTCTTTCAG TTTATTAAGG CTTGATGACT TTAATGTGTT TAGATAGCTT 780

AAAAAGGATT GAATCACTTA GTTTAGAATC TGAAACAATA GTATCAAGAT TTGATATATT 840

ATAAAAAGTA TAAAAATCAA ACTTATTGAA CTTACTATGA TCTGCGAGTA AATATTTTTT 900

ATTAGAATTA TTTAAAGCGA TGCGTTGAGC CTCTCCCTCT TCCTCGCTAA AAGTAGCTAG 960

AGCTCCGTTT TGAATACCAT TACAGCTAAC GAAAGCTTTA GAAAATTGGA GATTAGAGAG 1020

ATTTTGTAGG GTCAATGTGC CAACAAAAGC ACCTGTAATA TCGCGATAAT TTCCACCTAT 1080

CAAAATCAAA TCTGTTAATT TTCGTTCGCT TAAAATCAGA AAAACAGGTA GACTGTTGGT 1140

TACGACGCGG ATATTGTCAA TAGGCAACTC ACGCGCAAAA AACTCTAATG TTGTTCCTGG 1200

TCCAATGAAA ATAGTTTCTC TTTCTTCTAC TAGACTGCCT GCAAAATGGG CTATTTCTTG 1260

TTTTTCTGCC GTTTGGAGGG CTTGTTTTTC AATATTTGAT CGCTCATTAG TCAAAAGGGA 1320

GTTGGTTCGA AGTTTTTCAG CTCCACCATG CACACGAATC AGCAAATCTT TATCAGCTAA 1380

TTCCTGTAAA TAGCGCCTTG CAGTCATATC TGAAACGGCT ATTTCGTCCA TAATCTGTTT 1440

AACTGTTATG GTTCCTTTAC TATTTACTAT CTCTAAAATT TTGGCTAATT TTTCTTGTTT 1500

GAGCATATTA TCACCTCGTT TCCTACTACT ATCTTACCAT AAACAAACTC ATCATTCAAA 1560

TACAAAAACA ACAAAATGAA ACAAAAACAA AAATATCGAA GTTTGTTTTC AAAACTTTCG 1620

ATATTTTTGT TGGGTTATAA CTTTGATGTT TCTAGTTTAC TTTTTGATGA TTGAGAGTGA 1680

TGGAGAATTA GTCTAAACCG TAGTTATAGT CATCGTCTTG CATGGCTTCA ACTTCGCCAA 1740

GAAGGTAACC ATTTCCGACT TGAGAGAAGA AGTCATGGTT GGAAGTTCCT GTTGAAATAC 1800

CGTTCATAAC GATTGGGTTG ACATCTTCAG CTGAATCTGG GAAAAGTGGA TCTTGTCCCA 1860

TGTTCATGAG AGCTTTATTG GCATTGTAGC GAAGGAAGGT TTTAACCTCT TCAGTCCAAC 1920

CAACACCGTC ATAAAGACTC TCTGTGTAGC CTTCTTCATT TTCATAAAGA GTATAGAGTA 1980

GGTCGTACAT CCATTCTTTG AGTTTTTCTT GCTCTTCTTC AGGCAATTCA TTGAAACCAA 2040

GTTGGAATTT GTAACCAATG TAGGTTCCGT GAACAGACTC GTCACGAATA ATCAATTTAA 2100

TGATTTCTGC AACGTTGGCT AGTTTGTTGT TACCGAGATA GTAGAGGGGA GTGAAGAAAC 2160

CAGAGTAGAA GAGGAAGGTT TCGAGGAAGA CGCTGGCAAC TTTCTTTTCA AGTGGGCTGC 2220 

CGTTTAGGTA GATTTCGTTG ACAATCTCAG CCTTCTTTTG TAGGTAAGGA TTGGTATTGG 2280

TCCATTCGAA AATTTCTTCA ATCTCAGCCT TAGTATTCAA GGTAGAAAAG ATTGATGAGT 2340

AAGATTTAGC GTGGACAGAT TCCATAAATT GGATGTTATT GAAAACAGCT TCCTCATGTG 2400

GTGTACGGAT GTCTGCGCGA AGGGCTTGAA CCCCAGTTTC AGATTGCATA GTGTCAAGAA 2460

GGGTTAAACC ACCAAAAACT TTTCCGACCA AGTCTTTCTC TTTGTTAGAT AGCTTTCTCC 2520

AGTCATCCAA GTCGTTTGAT AAGGGAATAC GTGTATCGAG CCAAAATTGC TCCGTCAGTT 2580

TTTCCCAAGT TGATTTGTCG ATGACATCTT CGATGGCATT CCAGTTAATG GCTTTGTAGT 2640

AAGTTTCCAT TTAAAATCTC TTTCTGTGTT TAGTATTGCG AACTCACAAT TATTTCTACT 2700

TTACCATAAT TCTATAGGAG TATCGCACAA AAAGTCGGAA GCCCGACTTT TAAAATGTTA 2760

CATAAATTAT GTTATGACAT AGTAGATTTG ATTTTATCAG TGCTGCTTAG GGAAAAATAA 2820

TGTTTCTATG CTAGAAACTA AATCACACAG CTTTCACATT GGTTGGCGCC GACTTCTCCA 2880

CCGTCATCTG TAAAGGTACG GACGTAGTAG ATAGACTTGA TTCCCTTGTT AAAGGCATAG 2940

TTACGAAGGA TGGACAAGTC ACGTGTCGTT TGTTTATTTT CCCTCTTCCA TTCGTAAAGG 3000

CCTTTTGGAA TGTCACTACG CATGAAGAGG GTGAGTGAAA GTCCTTGATC CACGTGTTCA 3060

GTCGCAGCAG CGTAAACATC GATGACTTTA CGCATATCCA TATCGTAGGC AGAAGTGTAG 3120

TAAGGAATGG TTTCTGTAGA CAAGCCAGCA GCAGGGTAGT AGATTTTACC AATTTTCTTC 3180

TCTTGGCGT 3189

(2) INFORMATION FOR SEQ ID NO:43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3580 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
TTATTGAAGA AGGTGTTAAA GTTGTCACAA CAGGAGCAGG AAATCCAAGC AAGTATATGG 60 

AACGTTTCCA TGAAGCTGGG ATAATCGTTA TTCCTGTCGT TCCTAGTGTC GCTTTAGCTA 120

AACGCATGGA AAAAATCGGT GCAGACGCTG TTATTGCAGA AGGAATGGAA GCTGGGGGGC 180

ATATCGGTAA ATTAACAACC ATGACCTTGG TGCGACAGGT AGCCACAGCT GTATCTATTC 240

CTGTTATTGC TGCAGGAGGA ATTGCGGATG GTGAAGGTGC TGCGGCTGGC TTTATGCTAG 300

GTGCAGAGGC TGTACAGGTG GGGACACGGT TTGTAGTTGC AAAAGAGTCG AATGCCCATC 360

CAAACTACAA GGAGAAAATT TTAAAAGCAA GGGATATTGA CACTACGATT TCAGCTCAGC 420

ACTTTGGTCA TGCTGTTCGT GCTATTAAAA ATCAGTTGAC TAGAGATTTT GAACTGGCTG 480

AAAAAGATGC CTTTAAGCAG GAAGATCCTG ATTTAGAAAT CTTTGAACAA ATGGGAGCAG 540

GTGCCCTAGC CAAAGCAGTT GTTCACGGTG ATGTGGAGGG TGGCTCTGTC ATGGCAGGTC 600

AAATCGCAGG GCTTGTTTCT AAAGAAGAAA CAGCTGAAGA AATCCTAAAA GATTTGTATT 660

ACGGAGCCGC TAAGAAAATT CAAGAAGAAG CCTCTCGCTG GACAGGAGTT GTAAGAAATG 720

ACTAAAACAG CCTTTTTATT TGCTGGTCAA GGTGCCCAGT ATCTAGGGAT GGGACGGGAT 780

TTCTATGATC AGTATCCGAT TGTTAAAGAA ACGATTGATC GAGCGAGTCA GGTGCTAGGT 840

TATGATTTGC GTTATCTCAT CGATACGGAA GAAGACAAAC TCAATCAGAC CCGCTATACG 900

CAACCAGCCA TTCTAGCGAC TTCGGTTGCT ATCTACCGTT TATTGCAAGA AAAGGGCTAT 960

CAGCCTGATA TGGTTGCTGG TTTGTCTCTT GGAGAATACT CTGCCTTGGT GGCAAGCGGC 1020

GCCTTGGATT TTGAAGATGC GGTTGCCTTG GTAGCTAAGC GTGGAGCCTA TATGGAAGAA 1080

GCGGCTCCTG CTGACTCTGG CAAGATGGTA GCAGTTCTCA ATACGCCAGT AGAGGTCATT 1140

GAAGAAGCCT GTCAAAAAGC TTCTGAACTT GGAGTGGTTA CTCCAGCCAA CTATAACACA 1200

CCTGCACAAA TCGTCATTGC TGGAGAAGTG GTTGCAGTTG ATCGAGCGGT TGAACTTTTG 1260

CAAGAAGCAG GTGCCAAACG CTTGATTCCT CTTAAGGTGT CAGGTCCCTT TCACACCTCT 1320

CTCCTTGAAC CTGCTAGCCA GAAACTAGCT GAAACTCTGG CTCAGGTAAG TTTTTCAGAT 1380

TTTACTTGTC CCCTAGTCGG CAATACAGAA GCTGCTGTGA TGCAAAAAGA GGACATTGCT 1440

CAGCTCTTGA CGCGTCAGGT CAAGGAACCC GTTCGTTTCT ATGAAAGTAT TGGGGTCATG 1500

CAAGAAGCAG GCATAAGCAA CTTTATCGAG ATTGGACCGG GGAAAGTCTT GTCAGGTTTT 1560

GTTAAAAAAA TTGATCAAAC TGCTCACTTA GCTCATGTGG AAGATCAAGC GAGTTTAGTA 1620

GCACTTTTAG AAAAATAGAC TAAAATAAGT AGAAGTTTTG AAAGGAAAAA AATGAAACTA 1680

GAACATAAAA ATATCTTTAT TACAGGTTCG AGTCGTGGAA TTGGTCTTGC CATCGCCCAC 1740

AAGTTTGCTC AAGCAGGAGC CAACATTGTC TTAAACAGTC GTGGGGCAAT CTCAGAAGAA 1800

TTGCTCGCTG AGTTTTCAAA CTATGGTATC AAGGTGGTTC CCATTTCAGG AGATGTATCA 1860

GATTTTGCAG ACGCTAAGCG TATGATTGAT CAAGCTATTG CAGAACTGGG TTCAGTAGAT 1920

GTTTTGGTCA ACAATGCAGG GATTACCCAA GATACTCTTA TGCTCAAGAT GACAGAAGCA 1980

GATTTTGAA 6 . AAGTGCTCAA GGTCAATCTG ACTGGTGCCT TTAATATGAC ACAATCAGTC 2040

TTGAAACCGA TGATGAAAGC CAGAGAAGGT GCTATCATTA ATATGTCTAG TGTTGTTGGT 2100

TTGATGGGGA ATATTGGTCA AGCTAACTAT GCTGCTTCTA AGGCTGGCTT GATTGGCTTT 2160

ACCAAGTCTG TGGCACGCGA GGTCGCTAGT CGGAATATAC GAGTCAATGT GATTGCTCCA 2220

GGAATGATTG AGTCTGATAT GACAGCTATC TTATCAGATA AGATTAAGGA AGCTACACTA 2280

GCTCAGATTC CGATGAAAGA ATTTGGGCAG GCAGAGCAGG TTGCAGATTT GACAGTATTT 2340

TTAGCAGGCC AAGATTATCT AACTGGTCAA GTGATTGCCA TTGATGGTGG CTTAAGTATG 2400

TAGCGAAAGC TAGAGGTGAA AAGAATGAAA CTAAATCGAG TAGTGGTAAC AGGTTATGGA 2460

GTAACATCTC CAATCGGAAA TACACCAGAA GAATTTTGGA ATAGTTTAGC AACTGGGAAA 2520

ATCGGCATTG GTGGCATTAC AAAATTTGAT CATAGTGACT TTGATGTGCA TAATGCGGCA 2580

GAAATCCAAG ATTTTCCGTT CGATAAATAC TTTGTAAAAA AAGATACCAA CCGTTTTGAT 2640

AACTATTCTT TATATGCCTT GTATGCAGCC CAAGAGGCTG TAAACCAGCC AATCTTGATG 2700

TAGAGGCTCT TAATAGGGAT CGTTTTGGTG TTATCGTTGC ATCTGGTATT GGTGGAATCA 2760

AGGAAATTGA AGATCAGGTA CTTCGCCTTC ATGAAAAAGG ACCCAAACGT GTCAAACCAA 2820

TGACTCTTCC AAAAGCTTTA CCAAATATGG CTTCTGGGAA TGTAGCCATG CGTTTTGGTG 2880

CAAACGGTGT TTGTAAATCT ATCAATACTG CCTGCTCTTC ATCAAATGAT GCGATTGGGG 2940

ATGCCTTCCG CTCCATTAAG TTTGGTTTCC AAGATGTGAT GTTGGTGGGA GGAACAGAAG 3000

CTTCTATCAC ACCTTTTGCC ATCGCTGGTT TCCAAGCCTT AACAGCTCTC TCTACTACAG 3060

AGGATCCAAC TCGTGCTTCG ATCCCATTTG ATAAGGATCG CAATGGGTTT GTTATGGGTG 3120

AAGGTTCAGG GATGTTGGTT CTAGAAAGTC TTGAACACGC TGAAAAACGT GGAGCTACTA 3180

TCCTGGCTGA AGTGGTTGGT TACGGAAATA CTTGTGATGC CTACCACATG ACTTCTCCAC 3240

ATCCAGAAGG TCAGGGAGCT ATCAAGGCCA TCAAACTAGC CTTGGAAGAA GCTGAGATTT 3300

CTCCAGAGCA AGTAGCTATG TTAATGCTCA CGGAACGTCA ACTCCTGCCA ATGAAAAAGG 3360

AGAAAGTGGT GCTATCGTAG CTGTTCTTGG TAAGGAAGTA CCTGTATCAT CAACCAAGTC 3420

TTTTACAGGA CATTTGCTGG GGGCTGCGGG TGCAGTAGAG CTATCGCACC ATCGAGCTAT 3480

GCGTCATACT TTGTACCATG CCAGCTGGGC AAGTGAGGTA TCAGATATAT CGAGCTAATG 3540

TCGTTATGGC AGGTTTGAGA AGAATTCATA CGTATTCAAA 3580 
(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1780 base pairs 

(B) TYPE: nucleic acid

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

ATGTCCGCAA GAATGTGATT AATCAGCAAT CATCCTTGAT CGGAGATGAA TCGATCTTGG 60

CTTTTGGAGT GAACCAGCCT TTTAGCGGAT TTGGTGTTAA AGGAGAAAGA CAGCAACAGC 120

ATCAGCCTAT GACCTCTATG TACTGAAGCG ACCACTTCCC CAAGTAGGAC CTCGATGTCA 180

TTTTAGATAG TCAAAATCAG GCTGTCTGCA TTGTCGAAAT TACAAAGGTT TCTGTTGAAC 240

TCTTCAATCA AGTTTCTGCG CAACATGCCT TTAAGGAAGG TGAGGGAGAC AAATCACTTG 300

CCTATTGGCG CCAGGTTCAT GAGGACTTTT TCACAGACTG TTTGGGTGAA GTAGGGCTGA 360

CTTTTACACC TGAAAGCAAG GTTGTTTTAG AAGAATTTCG CAAGGTCTAC CCACTGTAGA 420

CTATTAGAAG GAAGAAAGTT TTGGAAATCG CTGTCCAATC CTTTTTTCTC AAGCAAAATA 480

TGATATAATA AGTTTGTTTG AAGAAGAGCA GCAGCTCTTA AACTTAGAAT AGGAGAAAAC 540

TATGCAAGCA GTTGAACATT TTATTAAGCA ATTTGTTCCT GAACATTATG ATTTATTTTT 600

AGATTTGAGT CGTGAGACCA AGACTTTTTC TGGGAAAGTG ACCATCACTG GTCAAGCACA 660

GAGTGACCGC ATCTCCCTCC ACCAAAAAGA CTTGGAAATC ACCTCTGTAG AAGTTGCAGG 720

TCAAGCTCGT CCATTTACAG TTGACCATGA CAATGAAGCC CTTCATATCG AATTGGCTGA 780

GGCTGGTCAA GTTGAATTGG TTCTTGCCTT TTCTGGTAAA ATTACAGACA ACATGACAGG 840

GATTTACCCT TCTTATTATA CAGTTGATGG AGTCAAGAAG GAGGTCTTGT CTACTCAGTT 900

CGAGAGCCAT TTTGCGCGCG AAGCTTTCCC ATGTGTGGAT GAGCCTGAAG CCAAAGCAAC 960

TTTTGACCTC TCTCTTCGCT TTGACCAAGC AGAAGGTGAA TTGGCCTTGT CAAACATGCC 1020

AGAAATCGAT GTTGAAAACC GTAAGGAAAC AGGTATCTGG AAGTTTGAGA CAACACCTCG 1080

CATGTCTTCT TACTTGTTGG CCTTTGTTGC TGGTGATTTG CAAGGGGTGA CCGCTAAAAC 1140

TAAAAATGGT ACCCTGGTAG GTGTCTACTC AACCAAAGCA CATCCACTTT CAAATCTTGA 1200

TTTCTCACTG GATATCGCTG TTCGCTCTAT CGAGTTTTAC GAAGATTACT ATGGAGTTAA 1260

GTACCCAATT CCTCAATCTC TCCACATCGC CCTTCCTGAC TTCTCAGCTG GTGCTATGGA 1320

AAACTGGGGT CTTGTGACCT ACCGTGAAGT TTACTTGGTT GTCGATGAGA ACTCTACATT 1380

TGCTAGCCGT CAACAAGTTG CCCTTGTTGT GGCCCATGAA TTGGCTCACC AATGGTTTGG 1440

GAACCTCGTG ACTATGAAAT GGTGGGATGA CCTTTGGCTC AATGAAAGTT TCGCTAATAT 1500

GATGGAATAC GTCTGTGTGG ATACCATCGA ACCAAGCTGG AATATCTTTG AAGATTTCCA 1560

AACAGGTGGA GTACCTCTTG CTCTTGAACG TGACGCTACT GATGGCGTTC AGTCTGTCCA 1620

CGTCGAAGTT AAACATCCAG ATGAGATCAA TACACTCTTT GACGGCGCTA TCGTCTATGC 1680

AAGGAAGCGT CTCATGCACA TGCTTCGCGT TGCTAGAGAT GCTGATTTGT AAGGTTGCAC 1740

GCCTACTTTG GAAACACCAT ACAGCACACC ATTGGAGTGA 1780 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 671 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

GTCTTTTGTA GCGAGGCCAG TGTCTTTTGC CCATCATTTG TCAGGCAGAT AAAACTAGAG 60 

CGTCTATCTT GATGGCAACA CATGCGACTG AGTAGACCGC AATTTTTAGC TTCCAAGCGA 120

GCCACCATCC TAGAAACTGC GCTCGGGCTC AGATGAAGCT TATCTGGCAG GTCAATCTGG 180

CGTAGAGATT TTTCTTCAGC CAAGTCCAGA TAGTAGAGCA GGTAGAACTC TTTCAAGGTC 240

AGACTTTGCT CGCTCTGTTG GGCAATGGTC TCTTCCAAGA GACTTTCAAT TTCTTTCTGA 300

CGCCGATTGA AGTCAAACCA TTTTTCCAAA TAGGTCATAG TGTCTCCTTT CTTTTTAGAG 360

TCATAAAATA GAAGAAAGTC CATTAACGGG CAGTCTCTGC GTCACAAGAT GATTGCGCAT 420

GCAATAATTA TACTACTTTT CAAGAATGCT GGCAAGCTCT GTTTTTTAGT GGTTTTCTTT 480

TTTACTGTCT ATATTTTTGG TAAAAATCAA CTTTTACTTG GATGAAGGTT TTGGCTTCAC 540 

GTAGGAGTTG AAGAAGGGTG GCGCGGGTTT CAATTCTTCT CTTGTCTTGG GCAGACTGCG 600 

GTTCCGGAAG ACTTCCAGAT AACGTTCAAT TTCATCTAGC AATCAGAGCA GGATTGGTCT 660 

GGCTCAGTGA C 671 
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1557 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: 
TTTCAGCTCA CAAATATAGG TCGGATGAGC CACTTCCTTA CGAACACGCG CATCAAAAGC 60

ATCTAGCTCC TCACGTGAAA AAGCATCCTG CAAACTATAA AGAGGATACT GATGACTGTA 120

TTTTTCAAAA CCATCTAAAA CCTTGCCACC AACACGATGA GTCGGACTGT CTGCTAGCAC 180

TTGCTCTGGA TAAGCAGTTT CTAACTCGAC CAACTCACGG TAAAGGCGGT CATACTCACT 240

GTCTGAAACC GAGGGATTAT CGCTGGTATA GTACTCAGTC GCATAGCGAT TGAGCAAAGC 300

GACTAACTCA TTCATTCTTT TATTCATAAG ACCATTTTAC CATAAAACAA GCCCTCCTCA 360

CAAACGAGAA GGGCGGAAAA AACACTTAGT TTGAAATTAT TTTTGAAACT CAAGCAACCT 420

TATATCAATT TTTCAAAATG AGTTCGAACA TAAATAAACG ATATACAAGA CAAGATGATA 480

ACACCACTTC CAATTATCAG GAAAGAAGAG AGATGTACAC TTGGCAAGAC TGTCATAAAT 540

CCTTTTGCAA TAGGCATAAA TAGAATAGCT AAGGTAAAAA TTGTACTCAG TACTCTTCCA 600

AGAAATTCGC TCTCAACCTT GGTTTGTACT TGAGTAAAAA AGTGAATATT AAAAATCGTC 660

ATAAACAATT CACAAACTAA ATTTCCAGAA AAGGAAAGAA AAGTTGGAAG TGGTAATCCC 720

ATCATAAAAA CTCCGACACC TGTCAAAGCC AGTAAAATCA AAAGATTATA AATATTAGCT 780

TTAATTTTAC TAGCTAGAAG AGCCCCAATG ATGGAACCAA TAGCCCCCAT AGTTAAAATA 840

CTTGCATAGG CTCCTTCTGA CCCGTAAAGC TGATTCGAAA AGGGAAGTAG AAATTCAAAA 900

GCTGCAAAAA AGAAATTAAC GCTGGAAGCT ACCAGCAAAA GGAAGAAAAT TTCTTGCTGA 960

TGCCAGATAT AGTGTAACCC ATCCTTGATA TCTACAAAAA TATCTCTCCC AGTAAAAGCC 1020

TTTTTCTCTT GAACTTTTGC TTCCTCTTTT GGAAGGAAAG CCACTAGAAC AAAAGCAATG 1080

AAAAAAGTCA GCGAGTCTAG CAGTAGCGTC ATATGGAGAC TTGCAAACTG TAAAACAAGG 1140

AAGGAAAGAA CAGGAGAGCT AACACCTACA ACCTGCAAAA CCAGCTCTAA GCGAGAATTA 1200

TAGATCACAA TCTCATTTTT CTCCACCACT TCAGTTATGA TAGCTTTATT GGCTGTGCGA 1260

GAAAAGGCAA AAGCAATAGC CTGCACAATG TTAGCAACAA TCAAAGCGCC AATCATCCAG 1320

CTATCATTCC TTATGAAAGA AATAGCCAGA CAAAGAATCC CACAAACAAG ATCTGCCGTC 1380

ATTAAAATCT TACGACGAGA AAAACGGTCT GAAATAACTC CGCCAAAGGG ATTGACGAGA 1440

ATAGATGTGA CGAGCTCAGA AATCTGATAC ATTCCTAAAA CTGTCTGTCC TATAGTCCCC 1500

ATAGAAGCCA ACCAGACACT ATTTCCATAA TCATAGAGCA TATTCCCATT TTATTGA 1557

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 658 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

CTTATTTGGT TTGGGAATTC GTCATGTCGG AAGCAAGGCT AGTCAGCTTT TACTTCAATA 60

TTTCCATTCA ATTGAAAATC TGTATCAGGC AGATTCAGAG GAAGTGGCTA GTATTGAAAG 120

TCTAGGTGGC GTGATTGCCA AAAGTCTTCA GACTTATTTT GCGGCAGAAG GCTCTGAAAT 180

TCTGCTCAGA GAATTGAAAG AAACTGGGGT CAATCTGGAC TATAAAGGAC AGACGGTAGT 240

AGCGGATGCG GCCTTGTCAG GTTTGACCGT GGTATTGAGA GGAAAATTGG AACGACTCAA 300

GCGCTCAGAA GCTAAAAGTA AACTCGAAAG TCTGGGTGCC AAAGTGACAG GTAGTGTTTC 360

TAAAAAGACC GACCTCGTCG TGGTAGGTGC AGACGCTGGA AGTAAACTGC AAAAAGCACA 420

AGAACTTGGT ATCCAGGTCA GAGATGAGGC ATGGCTAGAA AGTTTGTAAT GGATCGTTTA 480

AAAACAGAGT TTAGAGAATA TGACTATGTC TGTTAATTGA GACGAGATTG ACAAAAATTT 540

ATTAGTGAAA TAGGAAACAA AGTAAAAAGG AAAAATAAAA AATGTATACT ACCCTATGCG 600

CATTCATTAC CATCGTAAGA ATGGAGAATA TGACCTTGCT CCTTTGTAAA AGTCAGGA 658

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2474 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

ACAATCGATC AGACAGTCAA TCGATTTCTA AAATGTTTAG AGTAGAGATG TACCTATTCT 60

AGTTCAATAT ACTATATAAC TGAAAATTTA GATAAATTAG TTTTGGAAAT GACTAACCAA 120

AGATATCCAA AGTAGTCTAA AATTGTCTAT ACTTTATGAG TGTTTTAGTT AGGAAAAAGG 180

CTTGTTGTCT ATAATTGGCG CATTAGTCTA GATTTTATTT ATAGAAAATG TTATAATAGA 240

CTGTATTTAA AAAATTTTAA GGAGAAATGA CAGAATGTCT GTATCATTTG AAAACAAAGA 300

AACAAACCGT GGTGTCTTGA CTTTCACTAT CTCTCAAGAC CAAATCAAAC CAGAATTGGA 360

CCGTGTCTTC AAGTCAGTGA AGAAATCTCT TAATGTTCCA GGTTTCCGTA AAGGTCACCT 420

TCCACGCCCT ATCTTCGACC AAAAATTTGG TGAAGAAGCT CTTTATCAAG ATGCAATGAA 480

CGCACTTTTG CCAAACGCTT ATGAAGCAGC TGTAAAAGAA GCTGGTCTTG AAGTGGTTGC 540

CCAACCAAAA ATTGACGTAA CTTCAATGGA AAAAGGTCAA GACTGGGTTA TCACTGCTGA 600

AGTCGTTACA AAACCTGAAG TAAAATTGGG TGACTACAAA AACCTTGAAG TATCAGTTGA 660

TGTAGAAAAA GAAGTAACTG ACGCTGATGT CGAAGAGCGT ATCGAACGCG AACGCAACAA 720

CCTGGCTGAA TTGGTTATCA AGGAAGCTGC TGCTGAAAAC GGCGACACTG TTGTGATCGA 780

CTTCGTTGGT TCTATCGACG GTGTTGAATT TGACGGTGGA AAAGGTGAAA ACTTCTCACT 840

TGGACTTGGT TCAGGTCAAT TCATCCCTGG TTTCGAAGAC CAATTGGTAG GTCACTCAGC 900

TGGCGAAACC GTTGATGTTA TCGTAACATT CCCAGAAGAC TACCAAGCAG AAGACCTTGC 960

AGGTAAAGAA GCTAAATTCG TGACAACTAT CCACGAAGTA AAAGCTAAAG AAGTTCCAGC 1020

TCTTGACGAT GAACTTGCAA AAGACATTGA TGAAGAAGTT GAAACACTTG CTGACTTGAA 1080

AGAAAAATAC CGCAAAGAAT TGGCTGCTGC TAAAGAAGAA ACTTACAAAG ATGCAGTTGA 1140

AGGTGCAGCA ATTGATACAG CTGTAGAAAA CGCTGAAATC GTAGAACTTC CAGAAGAAAT 1200

GATCCATGAA GAAGTTCACC GTTCAGTAAA TGAATTCCTT GGGAACTTGC AACGTCAAGG 1260

GATCAACCCT GACATGTACT TCCAAATCAC TGGAACTACT CAAGAAGACC TTCACAACCA 1320

ATACCAAGCA GAAGCTGAGT CACGTACTAA GACTAACCTT GTTATCGAAG CAGTTGCCAA 1380

AGCTGAAGGA TTTGATGCTT CAGAAGAAGA AATACAAAAA GAAGTTGAGC AATTGGCAGC 1440

AGACTACAAC ATGGAAGTTG CACAAGTTCA AAACTTGCTT TCAGCTGACA TGTTGAAACA 1500

TGATATCACT ATCAAAAAAG CTGTTGAATT GATCACAAGC ACAGCAACAG TAAAATAATC 1560

TTAATAAACA GAAAACCCAC CTGAATTGGT GGGTTTTCTG ATGCACTATT TTCCAAAAAT 1620

CTCTTTGAGG TCTGTGTCTG TAATCCCAAT CATGGCTGGG ATGCGGTCCC AGTTTTCTTC 1680

GGTTAGGATG TAGGATTGTT CAGAGGCACT TGATGTGACT GTTTCAGAGA CAGCTTGTTG 1740

CTTTTCTTCA ACATTCTCCA GTAGATCACT GAAGCGTTCA ATCAGATAGG TTTTTCGGGC 1800

AGTTCCGATG TGTTGGGTAG CATAGTCGAA GGCTTGTAAT TCGCCTAGTA AGATGAGTTT 1860

GCTTTTGGCA CGTGTAATGG CTGTGTAGAT GAGATTTCGC TCCAGCATAC GTCGGCTAGC 1920

ACTAGTAATC GGTAGGATGA CAACTGGGAA CTCACTTCCC TGAGACTTAT GAATACTCAT 1980

GGCATAGGCC AAGCGAATCT TGTACCATTC GTTACGGGGG TAAGAGACTT CATTACTATC 2040

AAAATCAATG ACAATCTCGT CTTGTTTCGA TTCGGTGTAT TTACCAGGAA TCAGGTCTGT 2100

GATAGCTCCT AAATCCCCAT TAAAGACATT GATTTCAGCA TCGTTAACCA AATGAATGAC 2160

CTTGTCTCTC TTACGATAGT GACACTGAGG AGCTTCAAAA CTGAGTTGAT CTTTTTGTGG 2220

GGGATTGAGC AGGTCTTGCA TGAGCTGATT GATAGCATCA ATCCCTGCCG TCCCTCGGTA 2280

CATAGGAGCC AGAACTTGGA TATCACGGGC GGGAATACCA TTTCTGAGGG CGGCACCTAA 2340

GATTTTTTCA ATGGTGGCAG GAATATGGCC ACTAGCAATT TCAAAGTAGG AACGGTCAGC 2400

TTTTTTTTGG GTGAAATCAG CTGGCAAGAT GCCCTGTCGA ATCTGACTAG CTAGGGTGAC 2460

GATGGTTGAT TCTT 2474
(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 716 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

ATCAAAATTA ACGTATTCTT TTTGAAGTTC AAGAACTTCT TCCATTGTTG AGCATTCTGT 60

AAGGGCACGG TTTGCGTACT CTTCCATCTT AGCTGTGTCG AGTTTCTTCA TCAAGCTGCG 120

TGTACGAAGT ACAGATGTTG CTGACATAGA GAACTCATCC AAGCCCATTC CGACAAGAAG 180

TGGAACAGCT TGTTGGTCAC CAGCCATCTC ACCACACATA CCAGCCCATT TACCTTCAGC 240

GTGAGCTGCT TTGATCACAT TGTTAATCAA GCGTAGGATT GATGGGTTGT ATGGTTGGTA 300

AAGGTATGAA ACTTGTTCGT TCATACGGTC TGCTGCCATT GTATATTGGA TCAAGTCGTT 360

TGTACCAATT GAGAAGAAGT CAACTTCTTT AGCAAATTGG TCTGCAAGCA 7AGCCGCTGC 420

AGGAATCTCG ATCATGATAC CAACTTGAAT GTTATCCGCA ACTGCAACAC CTTCAGCAAG 480

AAGGTTTGCT TTTTCTTCAT CAAAGACTGC TTTCGCTGCA CGGAATTCTT TCAAGAGCGC 540

AACCATTGGG AACATGATAC GCAATTGACC GTGAACAGAC GCACGAAGAA GAGCACGGAT 600

TTGTGTGCGG AACATAGCAT CTCCAGTCTC AGAGATAGAG ATACGAAGAG CACGGAATCC 660

AAGGAATGGG TCATTCGTGA GGCATATCGA AGTAAGGAAG TCCTTATCTC CACCGA 716
(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 962 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50: 

AGTAACCTAA ATCAATTATG GTGTTATGAG TCTTGGTGTG CCCAAAGTGC TGACGTAACT 60 

ATCTCAGCTG AAGGTGCAGA TGCAGATGGC CTATCGCTGC AATCTCAGAA ACAATGGAAA 120

AAGAAGGATT GGCATAAGGG AAATGACAGA AATGCTTAAA GGAATCGCAG CATCTGACGG 180

TGTTGCAGTT GCAAAAGCAT ATCTACTCGT TCAGCCGGAT TTGTCATTTG AGACTATTAC 240

AGTCGAAGAT ACAAACGCAG AAGAAGCTCG CCTTGATGCC GCTCTACAGG CATCACAAGA 300

CGAGCTTTCT GTTATTCGCG AGAAAGCAGT AGGTACGCTC GGTGAAGAAG CAGCTCAAGT 360

TTTTGATGCT CACTTAATGG TTCTTGCTGA CCCAGAAATG ATCAGCCAAA TCAAGGAAAC 420

TATCCGTGCG AAGAAAGTGA ATGCAGAAGC AGGTCTGAAA GAAGTTACAG ATATGTTTAT 480

CACTATCTTT GAAGGCATGG AAGACAACCC ATACATGCAA GAACGCGCAC GGATATCCGC 540

GACGTGACAA AACGTGTATT GGCAAACCTT CTTGGTAAAA AATTGCCAAA CCCAGCTTCT 600

ATCAATGAAG AAGTGATTGT GATTGCGCAT GACTTGACTC CTTCAGATAC AGCTCAATTG 660

GACAAAAACT TTGTAAAAGC TTTTGTAACC AACATTGGTG GACGTACAAG CCACTCAGCT 720

ATCATGGCAC GTACACTTGA AATTGCTGCT GTATTAGGTA CAAACAACAT CACTGAAATC 780

GTTAAAGACG GTGACATCCT TGCTGTTAAC GGGATCACTG GAGAAGTGAT TATCAACCCA 840

ACAGATGAAC AAGCGGCAGA ATTTAAAGCA GCTGGTGAAG CCTATGCGAA CAAAAAGCTG 900

AATGGGCACT TTTGAAAGAT GCTCAACAGT GACTGCTGAC GGTAACACTC GAGTTGGCTG 960

CC 962

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2702 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51: 

20 

GATCGTTTCC GTGGCTTGAT CGGAAGCATG TTTGACGAAT AAAGAGGAAA AATAAATTAT 60

GACATTTTCA TTTGATACAG CTGCTGCTCA AGGGGCAGTG ATTAAAGTAA TTGGTGTCGG 120

TGGAGGTGGT GGCAATGCCA TCAACCGTAT GGTCGACGAA GGTGTTACAG GCGTAGAATT 180

TATCGCAGCA AACACAGATG TACAAGCATT GAGTAGTACA AAAGCTGAGA CTGTTATTCA 240

GTTGGGACCT AAATTGACTC GTGGTTTGGG TGCAGGAGGT CAACCTGAGG TTGGTCGTAA 300

AGCCGCTGAA GAAAGCGAAG AAACACTGAC GGAAGCTATT AGTGGTGCCG ATATGGTCTT 360

CATCACTGCT GGTATGGGAG GAGGCTCTGG AACTGGAGCT GCTCCTGTTA TTGCTCGTAT 420

CGCCAAAGAT TTAGGTGCGC TTACAGTTGG TGTTGTAACA CGTCCCTTTG GTTTTGAAGG 480

AAGTAAGCGT GGACAATTTG CTGTAGAAGG AATCAATCAA CTTCGTGAGC ATGTAGACAC 540

TCTATTGATT ATCTCAAACA ACAATTTGCT TGAAATTGTT GATAAGAAAA CACCGCTTTT 600

GGAGGCTCTT AGCGAAGCGG ATAACGTTCT TCGTCAAGGT GTTCAAGGGA TTACCGATTT 660

GATTACCAAT CCAGGATTGA TTAACCTTGA CTTTGCCGAT GTGAAAACGG TAATGGCAAA 720

CAAAGGGAAT GCTCTTATGG GTATTGGTAT CGGTAGTGGA GAAGAACGTG TGGTAGAAGC 780

GGCACGTAAG GCAATCTATT CACCACTTCT TGAAACAACT ATTGACGGTG CTGAGGATGT 840

TATCGTCAAC GTTACTGGTG GTCTTGACTT AACCTTGATT GAGGCAGAAG AGGCTTCACA 900

AATTGTGAAC CAGGCAGCAG GTCAAGGAGT GAACATCTGG CTCGGTACTT CAATTGATGA 960

AAGTATGCGT GATGAAATTC GTGTAACAGT TGTCGCAACG GGTGTTCGTC AAGACCGCGT 1020

AGAAAAGGTT GTGGCTCCAC AAGCTAGATC ACCGCGCCTA GGATAACAAT TTTAGCAATC 1080

AAGATAAACC AAAACATCAT AACAACAAGA AGAACGGAAC CTAAAATTCG GACATCCACC 1140

AAATGATGGA CATAGTAATT GAGATAACTA GAGAACAGAG TTAGTACACC TAAAATCACC 1200

AAGAGAACAA AGGCACTGCC TGGTAGGGTA TAGCTAATTT TCCTGTTAGA TAGATTGGGA 1260

AGAAAATAAT AAAGCATGAC CAAGATAGCA AAGAGGAGGG CGTAAATCAG AGGACCTGCC 1320

AACCCTTGTA AAGCCTGATA GATAATGCCA TCTTTTGTCC AATAATGAGC AAGTAAAGCC 1380

AAAATCATCT GACCAAATAA GATCAAAAAC AAGGCAAACG CAAAGAGGAA CTGCAAGCCA 1440

AAACTGACTA GGAGACTTAG CATCTGATGG GAAATAAGTC CACGACTCTT TTCGACGCCA 1500

TAAGCCTTGT TAAAAGCTTT TTGCAAGAAA TTTATAGATT TTGAAAAACT CCATAACGCC 1560

GATAAAACAG AAAAACTCAA TAAACCTGTT GAAGGTTGCG TCAAAGACTT CTCTGGCTAT 1620

TTTTTCCACA CCTTCATAGA GGCTTGGGGG CAGGACGTCT TTCATAAAGC CCAGAAATTC 1680

TCCCACAGGA ATCTGAAAAT AGGGGAGGAT ATTGACCACC ACCAAAAGCA GGGGGAAAAT 1740

CGAAATCAAC CAATAGTACG CTACTGCGAC ACTGGTCAAA CTCACTATCT GATGCTTGAT 1800

AATAATGCAA AAAAGCTTTT AATAAAGGCT TGTCTATCAG CTCTTTCCAC CACTTTTTCA 1860

TGTCATACTC CTTCATTTAT AATCTTATAC TCAATGAAAA TCAAAGAGCA AACTAGAAAG 1920

CTAGCCGCAA GCTGCTCAAA ACACTGTTTT GAGGTTGTAG ATAAGACTGA CGAAGTCAGT 1980

CACATACATA CGGTAAGGCG ACGCTGACGT GGTTTGAAGA GATTTTCGAA GAGTATTAAC 2040

TAATTTCTTC TTACCAATTC CACCATATCA TACGGTAGGG TATTGGCAGC TTCCTTCAAG 2100

GAATAGTTCT CTAAGTTATT TACATTTTGT CGTAATTTCT TGGCATACTT AGTTGTAATT 2160

AATCGTTTTT CTTCGTATTC GAAAATCAAC TTGCGCTCCA GATAATAGCC TCTCAGCATT 2220

TCATTGATAT TGTTGGGTTT GACACGATTG ATAACCCGTT CGACAAAGGC ACCACTGCTG 2280

ATAATAGTTG TTTCTCGAAG ACGAGACTCC TGCATAAAAC TAATCAAAGA GCGTCTGTAG 2340

ACTCCCTTCA GGTTTTCCAA ACTTTCAATA ATCATCTCCG TATTGGCAAG ATAGAGCTCT 2400

GCAATTTGGT CATAATCAAG AGCACGGAGA CGGCTTTGCT CCTTGTCCTT CCAGCTACGG 2460

AAGGTCTTTC CAAGAGTAAA AACTTCATGA AGGAGAAAAC GTAAAATCCT CAAGGAAACA 2520

AGAAAATAAT AGGTCAGTCT TGAGGCAAGT TTACGATTGA TTCCTTGTTC TATATTTTTC 2580

AGATAACGTT GGTAAACTCG GTAAGCACGA TTGCTAATGT TCCCCTCTTC ATAGGCCTGT 2640

TCCAAACCAT CACTTTCAAT ACTAAGAATC AAGAGTTTCA AAGCAGCCCA GTCTTCTTGA 2700

TC 2702

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6217 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

GAATTCCAAG AAGCTAGCCA AGAAAGTCGT GAACGTAGTG ATCCGCTAAA TAGTTATCTC 60 

CTTTTGTCAG GCTCCTTGAC GAAAGAAAAG CTTGCCGATA AATTAGGAGA TTTGGGTTAT 12 0 

AAAGCAAGTG CTGACCGAAA GATACCGCCC TATTTTCTTG CTTTTCGAAT ATTACTAAAT 180 

CCCCTTATTT TAATTAGTTT AGCAATATTT GGCTTATCTT TCTTTGCTTT AGTGATTATC 2 40 

ACTCGGATTA AGGAAATGAG AGCAGCAGGT ATAAAACTCT TTTCTGGTCA GACTCTCTTA 300

TCCATCATGG GGCATTCTTT ATCTACTGAT ATCAAATGGC TCCTTCTATC AGCCCTCCTT 360

TCCTTCCTAG GTGGGGGTGT CGTTCTTTTT AGTCAAGGTT TGTTTTATCC TATCTTGTTA 420

GCCACCTATG GTTTTGGGAT TAGTTTCTAT CTGTTGTTTT TATTGGCGAT TTCAATTTTA 480

CTAATGCTTC TTTATCTAAT GAGTTTGAAT ACAAAGCATT AGTTCCCGTT ATTAGGGGGA 540

GATTCCCCCT GAACCCTGAT GAATAACCCA TTGTTTCAGT AGACCTGTTT TTTCAGTAGG 600

ATACGCTTTA AGACAGGTTG ACGTCTTACC AACGATTGAA AGAACTTGAA ATTTCAGACA 660

AGATGGCAGG ATAGAGTAGA CTATTATCAC GATTTCTTTT GACTTAGGTT ATAGAGGTTG 720

AGATTCAGAA AATCAGAGCA AGTGGTATGC CTTTACCAAG GGAGCAGCGA AGAAGAACAA 780

GCTCTTTATG TAAAGGATAA TCTGCTCCAT TTTGCCAATC CACAAGGAAA AAATGAACAG 840

GGAGAGACAC TGGATACCTA TAGTCCAGAT GCTAATACGC TCTATGTTAG TCCCAGTTAT 900

TTGGACAAGG AAAAGGTCGT GGTAGATGCT GAGACCAAAC AGAAGTTAGC CCATCTCCAA 960

AAAGGTGAGT TTATCCTCTT GCTCCCAGAA CATTTGCGCT CTCGAGAAGC AGAACTTAAG 1020

AAAGTTTTTG AAGAAAGATT GAGTTATTAT GGAAAATCTG GTGAGGAGGC AAGTGCTCCT 1080

TTGGATTATG AGATGAAAGC GCACGTTAGT TATCTTTCAA TGGGAGAAAA GCGGTTTGTT 1140

TATAATAACG GTGAGAATCC CGTATCTACT CAGTATTTGA CTGATCCGAT TTTAGTTGTA 1200

TTCACGCCGA CTTCTACAGG TGATAGTTTT ATATCCTTAT CTAGTTGGTC TATCAATGCT 1260

GGAAAACAAC TCTTTATCAA AGGATATGAG AGTGGGCTAG AACTCTTGAA GAAAGCTGGA 1320

ATTTATGAGC AAGTATCCTA TCTTAAAGAA GGAAGAAGTG TTTATCTAAC TCGTTATAAT 1380

GAAGTTCAAA CTGAAACAGC AACTTTAATC TTAGGAGCTA TTGTGGGGAT AGCTAGTTCC 1440

TTGTTACTCT TTTATTCTGT CAATCTTCTA TATTTCGAGC AATTCCGCCG AGATATCTTG 1500

ATTAAACGAA TTTCAGGTTT ACGATTTTTT GAAACACATG CTCAGTATAT GGTTAGTCAA 1560

TTTGCCAGTT TTGTATTTGG TGCTAGTCTC TTTATTTTAA GCAGTCGAGA CTTGGTGATT 1620

GGCTTGCTCA CTTTATTAGT CTTTCTAGCT AGTGCAGTTT TGACGCTTTA CCGTCAAGCG 1680

CAGAZvAGAAT CTCGTGTTTC TATGACAATT ATGAAAGGAA AATAGGATGA TTGAACTAAA 1740

GAATATATCT AAAWVTTGG AAGCCGTCAG CTATTTTCAG ATACGAATCT TCATTTTGAA 1800

GGTGGGAAAA TTTATGCCTT AATCGGTACA AGTGGCTGTG GTAAGACAAC ACTCTCGAAT 1860

ATGATTGGAC GATTGGCGCC ATATGACAAA GGGCAAATCA TCTATGATGG CACTTCTCTT 1920

AAGGACATCA AGCCTTCTGT TTTCTTTAGA GATTACTTAG GATACTTATT TCAAGATTTT 1980

GGCTTAATTG AAAGCCAAAC CGTCAAAGAG AATCTCAATC TGGGTTTAGT TGGTAAAAAG 2040

TTGAAGGAAA AAGAGAAAAT CTCTTTGATG AAACAAGCTC TAAACCGTGT AAACCTCTCT 2100

TATTTGGATT TGAAGCAACC TATATTTGAG TTATCAGGAG GAGAAGCACA ACGTGTTGCA 2160

CTAGCGAAGA TAATTTTAAA GGATCCGCCT TTGATTCTTG CAGATGAACC AACCGCTTCC 2220

TTAGACCCCA AAAATTCTGA GGAATTACTT TCCATCCTAG AATCTTTAAA AAATCCGAAT 2280

CGGACCATTA TTATTGCGAC CCACAATCCT CTGATTTGGG AGCAAGTGGA TCAGGTCATT 2340

CGAGTTACCG ATTTATCACA TAGATGATAT GGTAAGATTC AGTTAGAAGA AAGAGTCACA 2400

AACACACTTT GTGGCTTTTT TATTTCCATA AAAATGGTAA AATAGTAGGA GTAGAAATGA 2460

GTTCGAGACA TGAAAGTAAT AGATCAATTT AAAAATAAGA AAGTTCTTGT TTTAGGTTTG 2520

GCCAAGTCTG GTGAATCTGC AGCTCGTTTG TTGGACAAGC TAGGTGCCAT TGTGACAGTA 2580

AATGATGGGA AACCTTTCGA GGACAATCCA GCTGCCCAAA GTTTGCTGGA AGAAGGGATC 2640

AAGGTCATTA CAGGTGGCCA TCCTTTGGAA CTCTTGGATG AAGAGTTTGC CCTTATGGTG 2700

AAAAATCCAG GTATCCCCTA CAACAATCCC ATGATTGAAA AGGCTTTGGC CAAGAGAATT 2760

CCAGTCTTGA CTGAGGTGGA ATTGGCTTAT TTGATTTCAG AAGCACCGAT TATTGGTATC 2820

ACAGGATCGA ACGGTAAGAC AACCACAACG ACTATGATTG GGGAAGTTTT GACTGCTGCT 2880

GGGCAACATG GTCTTTTATC AGGGAATATC GGCTATCCTG CCAGTCAGGT TGCTCAAATA 2940

GCATCAGATA AGGACACGCT TGTTATGGAA CTTTCTTCTT TCCAACTCAT GGGTGTTCAA 3000

GAATTCCATC CAGAGATTGC GGTTATTACC AACCTCATGC CAACTCATAT CGACTACCAT 3060

GGGTCATTTT CTGAATATGT AGCAGCCAAG TGGAATATCC AGAACAAGAT GACAGCAGCT 3120

GATTTCCTTG TCTTGAACTT TAATCAAGAC TTGGCAAAAG ACTTGACTTC CAAGACAGAA 3180

GCCACTGTTG TACCATTTTC AACACTTGAA AAGGTTGATG GAGCTTATCT GGAAGATGGT 3240

CAACTCTACT TCCGTGGTGA AGTAGTCATG GCAGCGAATG AAATCGGTGT TCCAGGTAGC 3300

CACAATGTGG AAAATGCCCT TGCGACTATT GCTGTAGCCA AGCTTCGTGA TGTGGACAAT 3360

CAAACCATCA AGGAAACTCT TTCAGCCTTC GGTGGTGTCA AACACCGTCT CCAGTTTGTG 3420

GATGACATCA AGGGTGTTAA ATTCTATAAC GACAGTAAAT CAACTAATAT CTTGGCTACT 3480

CAAAAAGCCT TATCAGGATT TGACAACAGC AAGGTCGTCT TGATTGCAGG TGGTTTGGAC 3540

CGTGGCAATG AGTTTGACGA ATTGGTGCCA GACATTACTG GACTCAAGAA GATGGTCATC 3600

CTGGGTCAAT CTGCAGAACG TGTCAAACGG GCAGCAGACA AGGCTGGTGT CGCTTATGTG 3660

GAGGCGACAG ATATTGCAGA TGCGACCCGC AAGGCCTATG AGCTTGCGAC TCAAGGAGAT 3720

GTGGTTCTTC TTAGTCCTGC CAATGCCAGC TGGGATATGT ATGCTAACTT TGAAGTACGT 3780

GGCGACCTCT TTATCGACAC AGTAGCGGAG TTAAAAGAAT AAAATATGAA AAAAATTGTC 3840

TTTACAGGTG GGGGGACGGT TGGACACGTT ACCCTCAATC TTTTGTTAAT GCCCAAGTTC 3900

ATCGAAGATG GTTGGGAAGT CCACTATATC GGGGACAAGC GTGGTATCGA ACACCAAGAA 3960

ATCCTTAAGT CAGGTTTGGA TGTCACTTTC CACTCCATTG CGACTGGGAA ATTGCGTCGC 4020

TATTTCTCTT GGCAAAATAT GCTGGACGTC TTCAAAGTTG GCTGGGGAAT CGTCCAATCG 4080

CTCTTTATCA TGTTGCGACT TCGTCCACAG ACCCTTTTTT CAAAGGGGGG CTTTGTCTCA 4140

GTACCGCCTG TTATCGCAGC GCGTGTGTCA GGAGTGCCTG TCTTTATTCA CGAATCTGAC 4200

CTGTCTATGG GCTTGGCCAA TAAAATCGCC TATAAATTTG CGACTAAGAT GTATTCAACC 4260

TTTGAGCAAG CTTCGAGTTT GTCTAAGGTT GAGCATGTGG GAGCAGTGAC CAAGGTTTCA 4320

GATCAAAAAA ATCCAGAACC AGATGAATTG GTGGATATTC AAACCCACTT TAATCATAAA 4380

TTGCCGACTG TATTGTTTGT TGGCGGTTCT GCAGGTGCTC GTGTCTTTAA CCAATTGGTG 4440

ACAGACCATA AGAAAGAACT AACAGAGCGC TACAATATTA TCAATCTAAC TGGAGATTCT 4500

AGTCTGAACG AGTTGAGCCA AAATCTTTTT CGTGTTGACT ATGTGACCGA TCTCTATCAA 4560

CCCTTGATGG AATTGGCTGA TATTGTTGTG ACACGAGGTG GTGCCAATAC GATTTTTGAG 4620

CTCTTGGCGA TAGCAAAATT GCATGTCATT GTGCCGCTTG GTCGTGAAGC TAGTCGTGGT 4680

GACCAGATTG AAAATGCAGC TTACTTTGTA AAAAAAGGCT ATGCAGAAGA CCTTCAAGAA 4740

AGCGATTTGA CCTTGGATAG TTTGGAAGAG AAGCTTTCTC ACTTACTAAG TCACAAGGAA 4800

GATTACCAAG CTAAGATGAA AGCTTCTAAG GAATTGAAAT CTCTAGCAGA TTTTTATCAA 4860

TTGTTGAAAA AAGATTTATC ATAAGGAAAG TAAATGTCAA AAGATAAGAA AAATGAGGAC 4920

AAAGAAACCC TCGAAGAATT GAAAGAGTTA TCAGAATGGC AGAAACGAAA CCAAGAATAT 4980

CTAAAAAAGA AGGCTGAAGA AGAGGTGGCT CTAGCTGAGG AGAAGGAAAA GGAAAGACAA 5040

GCTCGAATGG GAGAAGAATC TGAGAAGTCA GAGGACAAAC AGGACCAGGA GAGTGAAACA 5100

GACCAGGAAG ATTCAGAATC AGCTAAGGAA GAGTCTGAAG AAAAAGTAGC ATCCTCAGAG 5160

GCTGACAAAG AGAAAGAAGA ACCAGAGTCT AAAGAGAAGG AGGAACAGGA TAAAAAGCTT 5220

GCTAAAAAGG CTACAAAGGA AAAACCAGCC AAAGCAAAGA TTCCTGGTAT CCATATCTTG 5280

CGAGCC7TCA CGATTTTATT TCCAAGTCTG CTTTTATTGA TTGTCTCTGC CTACTTGCTC 5340

AGTCCTTATG CGACCATGAA AGATATTCGT GTTGAGGGAA CGGTGCAAAC TACAGCTGAT 5400

GATATTCGAC AGGCTTCAGG CATTCAGGAT TCGGATTATA CGATTAACCT TCTGCTAGAC 5460

AAGGCAAAAT ATGAAAAGCA GATTAAGTCT AACTATTGGG TTGAATCAGC TCAACTTGTC 5520

TATCAATTTC CAACTAAGTT CACTATTAAG GTCAAGGAAT ATGATATTGT GGCCTACTAT 5580

ATTTCTGGTG AAAATCATTA TCCTATTCTT TCCAGTGGTC AGCTTGAGAC TAGTTCTGTG 5640

AGTCTGAACA GTTTACCAGA AACTTATTTA TCAGTTCTCT TTAATGATAG TGAACAAATC 5700

AAGGTTTTTG TCTCAGAACT TGCTCAAATT AGCCCAGAAC TCAAGGCGGC TATCCAAAAG 5760

GTGGAATTAG CCCCAAGCAA GGTGACATCC GATTTAATTC GATTGACCAT GAATGATTCG 5820

GACGAAGTCT TGGTTCCTCT ATCTGAAATG AGTAAGAAAC TGCCATATTA CAGTAAGATT 5880

AAGCCACAAT TGTCAGAACC GAGTGTGGTC GACATGGAAG CTGGAATTTA CAGTTACACT 5940

GTGGCGGATA AATTAATTAT GGAGGCTGAG GAAAAAGCCA AACAAGAGGC CAAGGAAGCT 6000

GAGAAAAAAC AAGAAGAAGA ACAGAAAAAA CAAGAGGAAG AGAGCAATCG AAATCAAACA 6060

AATCAGCGTT CATCGCGTCG CTAGGTTTAC CTTTTCTCTT ATAGTTCTTT AGTGACCATG 6120

TTTTTACGTT TAATATTTGA CATTTGTTTT TCTTTATGTT ACATCTGCAA TGTAATCGAT 6180

TACAAAATAA TTTTTGATGA AGAAGGTAAC ACATATG 6217
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1491 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53: 

CTTGACACTT GATTGCGACT GTTGAATCTT ATCTCTCCAA GAAAAACACG TGAAGATGTT 60 

GAGTCTGCTG TCAGCAAGCT TGAAAGTAGC ACATCTGAGA AACATTGGAT CCATCTGCAG 120

TTTCTCGTGG GTCTAGCTTG GATCGTGATG ACAATGGTCT TTTGACTCTT GCTGGCGGTA 180

AAATCACAGA CTACCGTAAG ATGGGTGACG AGGCGCTATG GAGCGCGTGG TTGACATCCT 240

CAAAGCAGAA TTTGACCG7A GCTTTAAATT GATCAATTCT AA^ACTTACC CTGTTTCAGG 300

TGGAGAATTG AACCCAGCAA ATGTGGATTC AGAAATCGAA GCCTTTGCGC AACTTGGAGT 360

TTCACGTGGT TTGGATAGCA AGGAAGCTCA TTACCTAGCA AATCTTTACG GTTCAAATGC 420

ACCGAAGGTC TTTGCACTTG CTCACAGCTT GGAACAAGCG CCAGGACTCA GCTTGGCAGA 480

TACTTTGTCC CTTCACTATG CAATGCGCAA CGAGTTGGCT CTTAGCCCAG TTGACTTCCT 540

TCTTCGTCGT ACCAACCATA TGCTCTTTAT GCGTGATAGC TTGGATAGCA TCGTTGAGCC 600

AGTTTTGGAT GAAATGGGAC GATTCTATGA CTGGACAGAA GAAGAAAAAG CAACTTACCG 660

TGCTGATGTC GAAGCAGCTC TCGCTAACAA CGATTTAGCA GAATTAAAAA ATTAAGAAAA 720

AATAAAAGAG GTGGAGGGCA GCATTCCTTG TCGCCCGTCC CTTCTTTTTA ATGGAGACAG 780

AAAGATGATG AATGAATTAT TTGGAGAATT TCTAGGGACT TTAATCCTGA TTCTTCTAGG 840

AAATGGTGTT GTTGCAGGTG TGGTTCTTCC TAAAACCAAG AGCAATAGCT CAGGTTGGAT 900

TGTGATTACT ATGGGTTGGG GGATTGCAGT TGCGGTTGCA GTCTTTGTAT CTGGCAAGCT 960

CAGTCCAGCT CATTTAAACC CAGCTGTGAC CATCGGTGTG GCCTTAAAAG GTGGTTTGCC 1020

TTGGGCTTCC GTTTTGCCTT ATATCTTAGC CCAGTTCGCA GGGGCCATGC TGGGTCAGAT 1080

TTTGGTTTGG TTGCAATTCA AACCTCACTA TGAGGCAGAA GAAAATGCAG GCAATATCCT 1140

GGCAACCTTC AGTACTGGAC CAGCCATCAA GGATACTGTA TCAAACTTGA TTAGCGAAAT 1200

CCTTGGAACC TTTGTTTTGG TGTTGACAAT CTTTGCTTTG GGTCTTTACG ATTTTCAGGC 1260

AGGTATCGGA ACCTTTGCAG TGGGAACTTT GATTGTCGGT ATCGGTCTAT CACTAGGTGG 1320

GACAACAGGT TATGCCTTGA ACCCAGCTCG TGACCTTGGA CCTCGTATCA TGCACAGCAT 1380

CTTGCCAATT CCAAACAAGG GAGACGGAGA CTGGTCTTAC GCTTGGATTC CTGTTGTAGG 1440

CCCTGTTATC GGAGCAGCCT TGGCCGTGCT TGTATTGTCA CTTTTCTAAT C 1491

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1229 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:

ACAACGGATA ATGTCATCGA TCTCTTTGAA CACATCTTTA AGGAATGTTC AACGAAAACA 60

TTGTGATGGC GGGCAAGGTC AATCTCTTGA ATTTTGCCAA TCTAGCAGCC TA7CAGTTCT 120

TTGACCAACC GCAAAAGGTG GCCTTGGAGA TTCGTGAGGG GTTGCGTGAG GATCAGATGC 180

AAAATGTTCG TGTTGCAGAC GGTCAAGAGT CCTGTTTAGC TGACCTAGCG GTGATTAGTA 240

GTAAGTTCCT CATTCCTTAT CGGGGAGTTG GAATTCTAGC CATTATCGGT CCAGTTAATC 300

TGGATTACCA ACAGCTAATC AATCAAATCA ATGTGGTCAA CCGTGTTTTG ACCATGAAGT 360

TGACAGATTT TTACCGCTAC CTCAGCAGTA ATCATTACGA AGTACATTAA GATTGAAATC 420

ATTAAAGGAG GCGAACATGG CCCAAGATAT AAAAAATGAA GAAGTAGAAG AAGTTCAAGA 480

AGAGGAAGTT GTGGAAACAG CTGAAGAAAC AACTCCTGAA AAGTCTGAGT TGGACTTGGC 540

AAATGAACGT GCAGATGAGT TCGAAAACAA ATATCTTCGC GCTCATGCAG AAATGCAAAA 600

TATCCAACGC CGTGCCAATG AAGAACGTCA AAACTTGCAA CGTTATCGTA GCCAGGACTT 660

GGCAAAAGCA ATCTTACCAT CTCTTGACAA CCTTGAGCGT GCACTTGCAG TTGAAGGTTT 720

GACAGATGAT GTGAAGAAGG GCTTGGCGAT GGTGCAAGAA AGCTTGATTC ACGCTTTGAA 780

AGAAGAAGGA ATTGAAGAAA TCGCAGCAGA TGGCGAATTT GACCATAACT ACCATATGGC 840

CATCCAAACT CTCCCAGGAG ACGATGAACA CCCAGTAGAT ACCATCGCCC AAGTCTTTCA 900

AAAAGGCTAC AAACTCCATG ACCGCATCCT ACGCCCAGCA ATGGTAGTGG TGTATAACTA 960

AGATACAAAC GCTCGTAAAA AGCTCGCAGT AAAAATAGGA GATTGACGAG TGTTCGATGA 1020

ACACAAGAAA ATCTATCTTT TTTACTCAGA GCTTAGGGCG TGTTCGATTC GGCAATTCTG 1080

ACGGTAGCTA AAGCAACTCG TCAGAAAACG GCAATCGCTA TGACGTTTGC CTAGCTTCCT 1140

TACTAACTCG TCGTCGAAAT AAAATCGATT TCGACTCCTC GTGTCGCAAT TTACATAATA 1200

GAAAACTTGT CCGAACGACA TAAACTATG 1229
(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5816 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:

AAGAAGAAGA CTGTATGGAT AATCGACCAA TTGGTTTTTT GGATTCGGGT GTCGGGGGCT 60

TGACCGTTGT GCGCGAGCTC ATGCGCCAGC TTCCCCATGA AGAAATCGTC TATATTGGAG 120

ATTCGGCGCG GGCGCCCTAT GGCCCCCGTC CTGCTGAGCA AATTCGTGAA TATACTTGGC 180

AGCTGGTCAA CTTTCTCTTG ACCAAGGATG TCAAAATGAT TGTCATTGCT TGTAACACTG 240

CGACTGCGGT TGTCTGGGAA GAAATCAAGG CTCAACTAGA TATTCCTGTC TTGGGTGTAA 300

TTTTGCCAGG AGCTTCGGCA GCCATCAAGT CCAGTCAAGG TGGGAAAATC GGAGTGATTG 360

GAACGCCCAT GACGGTACAA TCAGACATAT ACCGTCAGAA AATCCATGAT CTGGATCCCG 420

ACTTACAGGT GGAGAGCTTG GCCTGTCCCA AGTTTGCTCC CTTGGTTGAG TCAGGTGCCC 480

TGTCAACCAG TGTTACCAAG AAGGTGGTCT ATGAAACCCT GCGTCCCTTG GTTGGAAAGG 540

TGGATAGCCT GATTTTGGGC TGTACTCATT ATCCACTCCT TCGCCCTATT ATCCAAAATG 600

TGATGGGGCC AAAGGTTCAG CTCATCGATA GTGGGGCAGA GTGCGTACGG GATATTTCAG 660

TCTTACTCAA TTATTTTGAA ATCAATCGTG GTCGCGATGC TGGACCACTC CATCACCGTT 720

TTTACACAAC AGCCAGTAGC CAAAGTTTTG CACAAATTGG TGAAGAATGG CTGGAAAAAG 780

AGATTCATGT GGAGCATGTA GAATTATGAC AAATAAAATT TATGAATATA AGGATGACCA 840

GAACTGGTAT GTTGGGTCTT ATAGTATTTT TGGTGGCGTT AACAGTTTGA GCGACTATAA 900

GGCAGATTTT CCTCTGTTTG AATTCTCCAA AATATTTGGA GATGAAGAGT ATGGTTTCCC 960

GCTTTCAGTT ACTGTTTTAC GCTATGGTTC TACCTACCGT TTGTTCTCCT TTGTGGTAGA 1020

CATGCTTAAT CAAGAAATGG GACGAAACTT GGAAGTTATT CAACGTCATG GGGCCCTGCT 1080

CTTGGTTGAA AATGGGCAAC TCTTGTATGT AGAATTGCCT AAAGAAGGGG TCAATGTTCA 1140

TGATTTCTTT GAGACAAGCA AGGTCAGAGA AACCTTGTTG ATTGCGACTC GTAACGAAGG 1200

TAAAACCAAG GAATTCCGAG CTATCTTTGA TAAGTTAGGC TACGATGTGG AAAATCTTAA 1260

TGACTACCCT GACCTGCCTG AAGTAGCAGA AACAGGTATG ACCTTTGAAG AAAATGCCCG 1320

CCTTAAGGCA GAAACCATTT CTCAATTAAC GGGCAAGATG GTTTTGGCAG ATGATTCTGG 1380

TCTCAAAGTC GATGTCCTTG GTGGCTTACC AGGCGTCTGG TCAGCTCGTT TCGCAGGTGT 1440

GGGAGCAACT GACCGTGAAA ATAATGCCAA ACTCTTGCAC GAATTGGCCA TGGTCTTTGA 1500

ACTCAAGGAC CGCTCGGCTC AGTTCCACAC AACCCTAGTC GTAGCCAGCC CAAATAAGGA 1560

AAGTTTAGTT GTTGAAGCAG ACTGGTCAGG TTATATTAAC TTTGAACCTA AGGGTGAAAA 1620

TGGCTTTGGC TATGATCCCC TCTTCCTTGT AGGAGAGACA GGTGAGTCAT CAGCTGAATT 1680

AACCCTGGAA GAAAAAAATA GTCAATCTCA CCGTGCCTTA GCCGTTAAGA AACTTTTGGA 1740

GGTATTTCCA TCATGGCAAA GCAAACCATC ATTGTAATGA GCGATTCCCA TGGCGATAGC 1800

TTGATTGTGG AAGAAGTCCG TGATCGCTAT GTGGGCAAAG TCGATGCCGT TTTTCATAAC 1860



WO 98/26072 PCT/US97/22578 



GGCGATTCTG AACTACGTCC GGATTCTCCA CTTTGGGAGG GCATCCGCGT TGTTAAAGGG 1920

AACATGGACT TCTACGCCGG CTACCCAGAA CGTCTGGTGA CTGAGCTTGG TTCGACCAAG 1980

ATTATCCAAA CTCATGGTCA CTTGTTTGAC ATCAATTTCA ACTTTCAAAA GTTGGACTAC 2040

TGGGCTCAGG AGGAAGAGGC CGCTATCTGC CTCTATGGTC ACTTGCATGT GCCAAGTGCT 2100

TGGATGGAAG GCAAGATCCT CTTTCTAAAT CCAGGTTCTA TCAGTCAACC ACGAGGTACC 2160

ATCAGAGAAT GTCTCTATGC TCGTGTGGAG ATTGATGATA GTTACTTCAA AGTGGACTTT 2220

TTGACACGAG ATCACGAGGT GTATCCAGGT TTGTCCAAGG AGTTTAGCCG ATGATTGCCA 2280

AGGAGTTTGA GACTTTCTTG TTGGGGCAGG AGGAAACTTT TTTGACCCCT GCTAAAAATC 2340

TAGCTGTGTT GATTGATACC CACAATGCGG ATCATGCGAC CCTCTTGCTC AGTCAGATGA 2400

CCTATACCCG TGTTCCCGTT GTGACAGATG AAAAACAGTT TGTTGGGACG ATTGGACTCA 2460

GAGATATTAT GGCTTATCAG ATGGAGCATG ACTTGAGCCA AGAAATCATG GCGGATACGG 2520

ATATCGTTCA TATGACAAAA ACGGACGTAG CGGTTGTTTC GCCTGATTTC ACCATTACGG 2580

AGGTCTTGCA CAAGCTAGTA GATGAGTCCT TCTTACCGGT CGTGGATGCA GAGGGTATTT 2640

TCCAAGGGAT TATTACGCGC AAGTCCATCC TCAAGGCCGT TAATGCCCTC TTGCATGACT 2700

TTAGTAAGGA ATATGAGATT CGATGCCAAT GAGAGACAGG ATTTCAGCCT TTTTAGAGGA 2760

AAAGCAGGGC TTGTCTGTCA ATTCCAAGCA GTCCTATAAG TATGATTTGG AGCAATTTTT 2820

AGACATGGTA GGTGAGCGGA TTTCTGAGAC CAGTCTCAAG ATTTACCAAG CCCAGCTAGC 2880

CAATCTAAAA ATCAGCGCCC AGAAGCGAAA GATTTCGGCC TGTAACCAAT TTCTATACTT 2940

TCTCTATCAA AAAGGAGAGG TGGACAGCTT TTACCGCTTG GAATTAGCTA AACAAGCTGA 3000

AAAGAAGACG GAAAAGCCAG AGATTCTATA CCTAGACTCT TTTTGGCAGG AAAGCGACCA 3060

TCCAGAGGGC CGCTTGCTAG CGCTCTTAAT CCTAGAAATG GGGCTCTTGC CCAGTGAGAT 3120

TTTAGCCATC AAGGTTGCGG ACATCAATCT GGATTTTCAG GTGCTGCGAA TCAGCAAGGC 3180

TTCCCAACAG AGGATTGTCA CCATTCCCAC GGCCTTGCTT TCAGAATTGG AACCCTTGAT 3240

GGGGCAGACC TATCTTTTTG AAAGAGGAGA GAAACCCTAT TCTCGTCAGT GGGCCTTTCG 3300

TCAGTTAGAA TCTTTTGTCA GGAGAAGGTT TCCATCCTTA TCAGCTCAAG TCTTACGTGA 3360

CAGTTTATTC TAAGCAAATA GAAACAGGTC GATTTGTACG AATTGCAAAA AATTAGGATT 3420

AAAAACAGTC CTGACCTTAG AAAATATAGA TAATGGATAT TAAATTAAAA AGATTTTTGA 3480

AGGACCCTGG ACTTGCTCTT TGCATCTGGT TTCTAAGTAC CAAGATGGAT ATCTACGATG 3540

TGCCCATTAC GGAAGTCATC GAACAGTATC TAGCCTATGT TTCAACCCTG CAGGCCATGC 3600

GTCTGGAAGT GACGGGTGAG TACATGGTCA TGGCTAGTCA GCTCATGCTG ATTAAGAGTC 3660

GTAAACTCCT TCCGAAGGTA GCAGAAGTGA CAGACTTGGG GGATGACCTG GAGCAGGACC 3720

TCCTCTCTCA AATCGAAGAA TATCSCAAGT TCAAGCTCTT GGGTGAGCAC TTGGA^GCCA 3780

AGCACCA^GA ACGGGCCCAG TATTATTCCA AAGCGCCGAC AGAGTTGATT TACGAAGATG 3840

CGGAGCTTGT GCATGACAAG ACGACCATTG ACCTCTTTTT GGCTTTTTCA AATATCCTAG 3900

CCAAGAAAAA AGAGGAGTTT GCACAAAATC ACACGACGAT CTTGCGGGAT GAGTATAAGA 3960

TTGAGGACAT GATGATTATT GTGAAAGAAT CCTTGATTGG ACGAGATCAA TTGCGCTTGC 4020

AGGATTTGTT CAAGGAAGCC CAGAATGTCC AAGAGGTCAT CACCCTCTTT TTGGCAACCC 4080

TAGAGTTAAT CAAAACCCAG GAGCTGATCC TCGTGCAAGA GGAGAGTTTC GGAGATATCT 4140

ATCTCATGGA AAAGAAGGAA GAAAGTCAAG TGCCTCAAAG CTAGACTTGA TAGAGAGGAA 4200

AGATGAGTAC TTTAGCAAAA ATAGAAGCGC TCTTGTTTGT AGCGGGTGAA GATGGGATTC 4260

GGGTCCGCCA GTTAGCTGAA CTCCTCTCTC TGCCACCGAC AGGCATCCAG CAGAGTTTAG 4320

GAAAATTAGC CCAGAAGTAT GAAAAGGACC CAGATTCCAG TTTGGCTTTG ATTGAGACAA 4380

GTGGTGCTTA TAGATTGGTG ACCAAGCCTC AATTTGCAGA GATTTTGAAG GAATACTCTA 4440

AGGCGCCTAT CAACCAGAGC TTGTCTCGGG CTGCCCTTGA GACCTTGTCC ATTATTGCCT 4500

ACAAACAGCC GATTACGCGG ATAGAAATTG ATGCCATCCG TGGGGTTAAC TCGAGTGGAG 4560

CCTTGGCAAA GTTGCAAGCT TTTGACCTGA TAAAGGAAGA CGGGAAAAAG GAAGTATTGG 4620

GGCGCCCCAA CCTCTATGTG ACTACGGATT ATTTCCTAGA TTACATGGGG ATAAACCATT 4680

TAGAAGAATT ACCAGTGATT GATGAGCTTG AGATTCAAGC CCAAGAAAGC CAATTATTTG 4740

GTGAAAGGAT AGAAGAAGAT GAGAATCAAT AAGTATATTG CCCACGCAGG TGTGGCCAGT 4800

AGGAGAAAAG CAGAAGAGCT GATCAAGCAA GGTTTGGTGA CAGTTAACGG ACAAGTGGTG 4860

CGTGAACTAG CAACCACTAT CAAGTCAGGC GACAAGGTCG AAGTTGAAGG TCAACCTATC 4920

TACAACGAAG AAAAGGTCTA TTATCTGCTT AACAAACCAC GCGGTGTCAT TTCCAGTGTA 4980

ACAGATGACA AGGGTCGCAA GACGGTTGTC GACCTCTTGC CCAATGTCAA AGAGCGCATT 5040

TACCCTGTGG GTCGTTTGGA CTGGGATACA TCAGGAGTCT TGATTTTGAC CAATGATGGG 5100

GACTTTACAG ACGAGATGAT TCACCCTCGT AATGAGATTG ACAAGGTTTA TGTCGCGCGT 5160

GTTAAAGGTG TGGCCAATAA GGACAATCTT CGCCCCTTGA CCCGTGGTCT TGAGATTGAT 5220

GGTAAGAAAA CCAAGCCAGC TGTTTATGAA ATTCTCAAAG TGGACCCAGT CAAAAATCGC 5280

TCTGTGGTGC AGTTGACCAT CCATGAAGGG CGTAACCATC AGGTTAAAAA GATGTTTGAA 5340

GCTGTTGGTC TCCAAGTAGA TAAGTTGTCT CGGACTCGTT TCGGACACCT AGACTTGACA 5400

GACTCCGTCC AGGAGAATCC CGTCGTCTTA ATAAAAAAGA AATCAGCCAA CTACACACCA 5460

TGGCTGTAAC TAAGAAATAA TGAAACGAAT TTTAATAGCG CTTGTGCGCT TTTACCAACG 5520

TTTTATCTCA CCAGTCTTTC CACCCTCTTG TCGCTTTGAG CTGACTTGTT CCAACTACAT 5580

GATTCAGGCT ATTGAAAAAC ATGGTTTTAA GGGGGTATTG ATGGGCTTGG CTCGGATTTT 5640

ACGTTGTCAT CCCTGGTCGA AAACAGGTAA GGACCCCGTT CCAGACCACT TTTCCCTTAA 5700

ACGAAATCAA GAAGGGGAAT GAGGTGGGGT AAATAGATTT CAAAATGATA AAAACGCATC 5760

CTATCAGGTT TGAGTGAACT TGATAGGATG CGTTTTAGAA TGTCAAAATT TTATAC 5816

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 725 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

TTGAAAATAA TTATGAACCG CAATATATTA ATATCCGAGG AAAAGGCCCT CTTATCAATG 60 

ACTTGAAAAA AGAAGCTAAA AAAGCTAATA AAGTTTTTCT CGCGAGTGAC CCGGACCGTG 120

AAGGAGAAGC GATTTCTTGG CATTTGGCCC ATATTCTCAA CTTGGATGAA AATGATGCCA 180

ACCGTGTGGT CTTCAATGAA ATCACCAAGG ATGCAGTCAA AAATGCTTTT AAAGAACCTC 240

GTAAGATCGA TATGGACTTG GTCGATGCCC AACAAGCTCG TCGGATCTTG GATCGCTTGG 300

TAGGGTATTC GATTTCGCCT ATTTTGTGGA AGAAGGTCAA GAAGGGCTTG TCAGCAGGTC 360

GCGTTCAGTC CATTGCCCTT AAACTCATCA TTGACCGTGA AAATGAAATC AATGCCTTCC 420

AGCCAGAAGA ATACTGGACA GTTGATGCTG TCTTTAAAAA GGGAACCAAA CAATTTCATG 480

CTTCCTTCTA TGGAGTAGAT GGTAAAAAGA TGAAACTGAC CAGCAATAAC GAAGTCAAGG 540

AAGTCTTGTC TCGTCTGACG AGTAAAGACT TTTCAGTAGA TCAGGTGGAT AAGAAAGAGC 600

GTAAGGCAAA TGCTCCTTTA CCCTATACCA CTTCATCTAT GCAGATGGGA TGCTGCCAAT 660

AAAATCAATT TCCGTACTCG AAAAACCATG ATGGTTGCCC AACAAGCTCT ATGAAGGAAT 720

TATAT 725

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1935 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:

AACCCATCTG AAGAACTTTT CCGTGCTGCT AGCTCAGCTA TCGATAAAGC AGAAACTAAA 60

GGTTTGATTC ACAAAAACAA AGCAAGCCGC GATAAAGCTC GTCTTTCAGC TAAACTTGCT 120

AAATAAGAAA CAGTCCATAG AGGCTGTTTT TTTGTCTCCA AATAGGAAAA GGTAGAAAAT 180

GAAAATCACA ATTATCGGAT ATTCTGGTTC TGGTAAGTCA ACTCTAGCAG AAAAGTTATC 240

TAACTACTAC TCCATTCCAA AACTGCACAT GGACACACTC CAATTTCAAC CTGGTTGGCA 300

AGACAGTGAC TGCGAATGGA TGTTAACCGA GATAAAAAAC TTTCTCACCA AGCATAAAGC 360

TTGGGTCATC GATGGTAATT ATTCTTGGTG CTACTACCAA GAACGAATGC AAGAAGCTGA 420

CCAAATCATC TTTCTCAATT TTTTGCCATT GACCTGTCTC TTTAGAGCCT TTAAGCGTTA 480

TCTTAAATAC CGTGGAAAAG TCAGAGAAAG TATGGCGGCA GATTGCCCTG AACGCTTTGA 540

GTGGGAGTTT ATCAGATGGA TTCTTTGGGA TGGGCGTAGC AAAACTCAAA AAGAAAATTA 600

CCAAAAACTT TGCCAAGAAT ATTCACATAA AGTCACTATC CTTCGAAATC AGAGAGAGCT 660

AGATCAATTT CTGGATAAGA AAAGGAAGTC CTACAATTCA TAAAGGGCTT CCTTTTTGGC 720

TATAATTATT CTGCAATCAA GGTTTCCAAA CCAACCTTCA TCATATCAGT GAAGGTATTT 780

TGACGTTCTT CTGCAGTTGT GTCTTCGTCT GGATTGACCA AGCTATCAGA GATGGTCATG 840

ATAGCTAGCG CATCAACATG GTATTGGGCA GCAAGATAGT AAAGAGCTGC TGCTTCCATT 900

TCCACAGCCT TGACTCCCCA TTTACCAAGC TCGATATTCT TTTCAAAGTA ATTTGAGTAA 960

AAGACATCAG ATGACAAAAC GTTCCCAACG TGAGTAGTCA TACCAAGTTC TTTGGCGATA 1020

TGGTAGGCTT TATCAAGCAA ATCAAAGCTA GCAATTTGTG GAAAATCGTA CTGTGGCCAG 1080

TCATTACGAA CGATGTTTGA GTTGGTTGCA GCCGCCTGCG CCAAAACTAA TTCACGAACA 1140

TGAACCTCTT CATTCAAAGA ACCTGCAGTT CCCACACGAA TCAATTTCTT CACACCGTAG 1200

TCTACGATTA ACTCACGCGC ATAAATCGAA ATAGATGGCA TTCCCATCCC AGTTCCCATG 1260

ACAGATACAC GGTGACCCTT GTAAGTACCA GTGTAACCAA ACATGTTACG CACTTCGTTA 1320

AAACAAACAG CATCACCAAG GAAATTCTCC GCAATAAACT TAGCACGAAG AGGATCCCCA 1380

GGAAGAAGAA TTTTATCAGC AATTTCACCC TGCTGAGCAG CAATATGGAT AGACATAATT 1440

TATGATACAA AGAGCGAGAA GAAAACGACT GAAAATTAGG AACCTGACGA GAAATCCTGA 1500

TTTTTCAGTC AGATTATCTA TTTTCCGAGT TTTCCGCTCG TGTTCAAATC AAAACACACG 1560

CTCTACCTTT CTTTATTTTA TATTTTATAT TGAGAAAGAT ACCAAACCCA TCAAAAAGCG 1620

AAGGGAAAAT AGGAGTTGGG CGCAGTGAGC GATGCTCGCT AGACCAACTA TCTTTTTCCC 1680

ACTGCTTT7A GGGTGGGGTC AATTCCTTTC TTTCTTAATT TTGATTTAGA GGAGAGTCGC 1740

CCGTATTCAG TTCAGCGAAT ACAGTTTACC CATCCTTTCG TTTTTATT7T TAGAAAAG7T 1800

TTCTACTCGT GTTCAAATTA GAACACGCGC TCTACCTTTC TGTTTATACT CTTCGAAAAT 1860

CTCTTCAAAC CACGTCAACG TCGACTTGGA TTATATATGT GACTGACT7C GTCATCT7TA 1920

TCTACAACCT CAAAG 1935

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2221 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:

TATTATTTTT CCCATCCTAA CTGGAACCTA TGTCGCGCGT GTCTTGGACC GAACTGACTA 60

TGGTTACTTC AACTCAGTCG ACACTATTTT GTCATTTTTC TTGCCCTTTG CAACTTATGG 120

TGTCTATAAC TACGGTTTAA GGGCTATCAG TAATGTCAAG GATAACAAAA AAGATCTTAA 180

CAGAACCTTT TCTAGTCTTT TTTATTTGTG CATCGCTTGT ACGATTTTGA CCACTGCTGT 240

CTATATCCTA GCCTATCCTC TCTTCTTTAC TGATAATCCA ATCGTCAAAA AGGTCTACCT 300

TGTTATGGGG ATTCAACTCA TTGCCCAGAT TTTTTCAATC GAATGGGTCA ATGAAGCTCT 360

GGAAAATTAC AGTTTTCTCT TTTACAAAAC TGCCTTCATC CGTATCCTGA TGCTGGTCTC 420

TATTTTCTTA TTTGTTAAAA ATGAACACGA TATTGTTGTC TATACACTTG TGATGAGTTT 480

ATCGACGCTG ATTAACTACC TGATTAGTTA TTTTTGGATT AAAAGAGACA TCAAACTTGT 540

TAAAATTCAC CTAAGTGATT TTAAACCACT CTTTCTCCCT CTGACAGCCA TGTTAGTCTT 600

TGCCAATGCC AATATGCTCT TCACTTTTTT AGATCGCCTC TTCCTCGTTA AAACAGGGAT 660

TGATGTCAAC GTTAGTTACT ATACCATAGC TCAGCGAATT GTGACCGTTA TAGCTGGGGT 720

TGTAACAGGT GCAATTGGAG TGAGTGTGCC TCGTCTCAGT TACTATCTGG GGAAAGGAGA 780

CAAAGAAGCC TATGTTTCTC TGGTTAATAG AGGTAGTCGA ATCTTTAACT TCTTTATCAT 840

TCCACTGAGT TTTGGACTCA TGGTTTTAGG ACCAAATGCC ATCCTACTTT ACGGTAGTGA 900

AAAATATATC GGAGGCGGCA TCTTGACCTC TCTCTTCGCT TTTCGTACGA TTATCCTGGC 960

CTTAGATACC ATTCTTGGTT CCCAAATTCT CTTTACCAAT GGCTATGAAA AACGTATCAC 1020

AGTCTATACA GTCTTTGCTG GGCTACTCAA TTTGGGCTTG AATAGTCTCC TTTTTTTCAA 1080

CCATATCGTG GCTCCTGAAT ACTACTTACT GACAACTATG CTATCAGAGA CTTCTCTACT 1140

TGTTTTCTAT ATCATTTTCA TCCATAGAAA ACAACTCATC CACTTGGGAC ATATCTTTAG 1200

CTATACTGTT CGATACTCTC TCTTTTCACT TTCCTTTGTA GCAATTTATT TCCTGATTAA 1260

TTTCGTGTAT CCTGTAGATA TGGTCATTAA TTTGCCATTT TTGATTAATA CTGGTTTGAT 1320

TGTCTTGCTA TCAGCTATCT CTTATATTAG TCTACTTGTC TTCACAAAAG ATAGCATTTT 1380

CTATGAATTT TTAAACCATG TCCTAGCCTT AAAAAATAAA TTTAAAAAAT CATAGGAGTT 1440

TAAAATGAAA CAACTAACCG TTGAAGATGC CAAACAAATT GAATTAGAAA TTTTGGATTA 1500

TATTGATACT CTCTGTAAAA AGCACAATAT CAACTATATT ATTAACTACG GTACTCTGAT 1560

TGGGGCGGTT CGACATGAGG GCTTTATCCC TTGGGACGAC GATATTGATC TGTCCATGCC 1620

TAGAGAAGAC TACCAACGAT TTATTAACAT TTTTCAAAAG GAAAAAAGCA AGTATAAGCT 1680

CCTATCCTTA GAAACTGATA AGAACTACTT TAACAACTTT ATCAAGATAA CCGACAGTAC 1740

GACTAAAATT ATTGATACTC GAAATACAAA AACCTATGAG TCTGGTATCT TTATCGACCT 1800

GCAGGCATGC AAGCTTGGCG TAATCATGGT CATAGCTGTT TCCTGTGTGA AATTGTTATC 1860

CGCTCACAAT TCCACACAAC ATACGAGCCG GAAGCATAAA GTGTAAAGCC TGGGGTGCCT 1920

AATGAGTGAG CTAACTCACA TTAATTGCGT TGCGCTCACT GCCGGCTTTC CAGTCGGGAA 1980

ACCTGTCGTG CCAGCTGCAT TAATGAATCG GCCAACGCGC GGGGAGAGGC GGTTTGCGTA 2040

TTGGGCGCCA GGGTGGTTTT TCTTTTTCAC CAGTGAGACG GGCAACAGCT GATTGCCCTT 2100

CACCGCCTGG CCCTGAGAGA GTTGCAGCAA GCGGTCCACG CTGGTTTGCC CCAGCAGGCG 2160

AAAATCCTGT TTGATGGTGG TTCCGCAAAT CGGCAAAATT CCTTATAAAA TCAAAAGGAA 2220

T 2221

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1509 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

TGAATTTTGA ACAGTACACA GAATACTAAA ATATTTCTAG AAATTAATTT GAATTTTCTA 6 0 

ATTGGATTTG TCGCATCTTA TTTCAATCTA CTATAGAAAA AAGTCTTTAA AATTATAAAA 120

CGCATCATAT CAAGGTTTTT CAAAAACCTT GATATGATGC CTTTTATTGT GGGAATATGT 180

ATTTCATTTT CTACTAAAAT TATGTTTTTG AATAACCTCT ATCTTAGTAG TTTGTATAAT 240

CCCCCTCAAT CAGTTTTTGC GATAAGCTTT AATGCTATGA CTATACCACT CTTGCATTTC 300

TTTTGATGGT GGTGTCATAT AATCGCCATA CATCTGGGTT AAAAATTGGT CATATTTTTT 360

GGGAACAGGC AACATACGGC CCTCAAACTC AGTTAAAATC AGTTCTTTAA AGGTATCAAC 420

TGGGAAGATT TCTTTCATCC CTTCCTTACC GATCCCAACT CCTCCTTCAT ATTGAGGAGT 480

GTTGGTTACA GCATTTTTGA CTAGTTGATC AATTTTCTTG TAAAAGTAGC GAGGATTGAC 540

AAATCGGAGA GCGTACCAGC TACATAATCT AAGAAAATCT TTTAGTTTGC TATCACCGTG 600

AACTGCTCGT GATTTTTTGA TATAAGCTAG TTGACGAAGA GCCACATACT TATAGCTCTT 660

GTCGACAATG CTCAAGTCTG TAAATCGATC AATTGGGAAG ACATCGATGA AAAGGCTGGT 720

ATCATGACGC TTGTACTTAA CATGGTCTTC TATAACAGTA GAAGTGTCCA AAATCGATGC 780

GAAATTATGG AAGTACCAAG AAGATGTATC GTAGGAAAGA ACCTTGTAGC GAGGGTGATT 840

TTCTTCTTCA ATAATCTTCA GTAAACGCTC ATAATCCTCA CGATAAAGGG AAATATCAAT 900

ATCATCATCC CAAGGAATCA TACCTTTGTG GCGGATGGCT CCAAGCATGG TTCCATAACT 960

GAGAAAATAA GGAATATCAT GTTTCTTACA AGTCTCATCA ATATAGTCCA GCAGGGCTAG 1020

TTGAATTTCT TTAATTTCTT TTTTTTCTAA ATATTGCATC CTAATCCTCC AATTTATAAG 1080

CGTGAAATTC ATGACTGTAG AAGCGTTTTT CTTCTGGTGG TAGGGTCATA TAATCTCCAT 1140

AAAATTGTGT CAAAATAGTA TCAAATTTTT CAGGTGCAGG AAGGCTTAAA TTCTCAAAGG 1200

GTAAATCGAT TGTTTTATCA AAGGTACCAC TTGGGAAGAC TTCCTTTTCC TTAAATTTTG 1260

AAGGGATAAA AGCCATATAT TGCCCATTTT CACGACTATA TTTTTGAATT TCTTTCTCGA 1320

TTTTATTTGC AAAATAACGA GGAGAAACCG GTCGAAGGAG TAACCAGAAG GCTGTTCGTA 1380

TCCAATCTTT TAAAAGGCTA TCCTTATAGA CAATATTTTT ATGTTTACTG AAAGACAGCA 1440

GTTTGAAGCT TTCCAGTTTA TAACAAGTAT CAATGACCTT AGGATCATCA AAGCGATCTA 1500

TAGGGAAAA 1509

12) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 671 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

ACAAGGGATT TATTCCTTGG GACGACGACC TAGACTTTTT TATGCCTCGT AAAGATTATG 60 

AGAAATTAGC AGAATTATGG CCTCGTTATG CAGATGAACG TTATTTCTTG TCAAAGAGTC 120

ACAAGGATTT TGTTGATCGT AATCTTTTTA TTACCATTCG TGACAAGAAA ACCACCTGTA 180 

TCAAGCCTTA TCAGCAGGAT TTGGATTTGC CACATGGTCT GGCCTTGGAT GTTTTGCCTT 240

TGGATTATTA TCCGAAAAAT CCAGCTGAGC GGAAAAAACA GGTTCGTTGG GCCTTGATTT 300 

ATTCACTCTT TTGTGCGCAA ACTATTCCAG AAAAGCATGG TGATCTCATG AAATGGGGAA 360 

GTCGCATTTT ACTGGGTTTG ACTCCAAAAT CTCTCCGTTA TCGCATCTGG AAAAAAGCTG 420

AGAAAGAAAT GACTAAGTAT GATTTGGCTG ATTGTGATGG CATTACAGAA TTATGCTCAG 480

GTCCTGGCTA CATGAGAAAC AAGTACCCAA TCACATCTTT TGAAGACAAT CTTTTCTTGC 540

CATTTGAAGG AACAGAGATG CCTATTCCAA TCGGCTATGA TGTCTATCTC AGAACTGCTT 600 

TTGGGGATTA TATGACGCCT CCACCAGCAG ACAAGCAGGT ACCGCATCAT GATACTGTCA 660

CTGCTGATAT G 671 
(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2766 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

ATCTTATACA AGTCGTAAGC CGCTTCCTTA AAACCAGCTT CTAGTAATTC TTCCAATAAG 60

ATAGTAACCT TCACACCATT TGGTGTTCCC AGTGAATAAA GCTGAAAAGC TTGTTCTCCT 120

TTTGGCAAGT TTTGTTCGAA ACGGGCACCT GCTGTTGGTC TGTTTAGCCC CGTAAAAGCT 180

CCTTGATTAC TAGCTTCATC CTGCCATACG GTCGGTAATT GATATGCTGA CATCCGAGAC 240

CTCCCTTAAA TCGCATTCTT GTCAAAACCG AGTTTGCGTT GAATAAACTT AACGATTTCG 300

ACGATGATAA TCATTGAGAA GCTTCCAGCC ATAACAATTC CCCATTGTGA CAAGTCTAGT 360

TTGGTTACGT GGAAGATTCC TTCAAGCGGT TCTACAACGA TTGTTGCCAT GAGAAGGATA 420

AAGGATACCA AGATGGACCA GTTAAAGGTC TTAGACTTGA ATGGGCCAAC TGTCAAGATG 480

GATTGGTAGA CAGACTTGAC ATTGTAGGCA TGGAAGAGCT GAATCAAACC AAGGGTTGCA 540

AAGGCCATCG TTAGGGCATC TGCATGAATA GCATGATTGT CACCCACATG AACTGGGTAA 600

GCAATCGCAA GGCCATAAAC ACTCATAACA AGAGCTGCTT GGAGTACACC TTGATAAATG 660

ATAGAACTCA AAACACCACC TGAGAAGAAG CTTGCCTTGC GTCCACGTGG TTTATGATTC 720

ATGACACCAG GTTCCGCAGG TTCAACACCA AGAGCGATAG CTGGGAAGGT ATCCGTTACC 780

AAGTTGATCC ACAAAAGATG AACCGGCTGC AAGACATCCC AACCAAACAA GGTTGATAGG 840

AAGATGGTTA ATACTTCAGC AGTATTAGCA GAAAGTAGGT ACTGAATAGT CTTTTGAATG 900

TTTGAGAAGA CCTTACGTCC TTCTTCCACT GCGACGATAA TAGTCGCAAA GTTATCATCT 960

GCAAGAATCA TATCAGAAGC CCCCTTAGAA ACCTCTGTAC CAGTGATTCC CATACCGATA 1020

CCAATATCGG CTGTTTTCAG AGCTGGCGCG TCATTGACAC CGTCACCTGT CATGGCAACG 1080

ACTTTACCTT GTTTTTGCCA AGCCTTGACG ATACGAACCT TGTGTTCTGG AGACACACGG 1140

GCATAAACAG AGTATTGACC AACAACTTTT TCAAATTCTT CATCTGACAG TTCATTGAGT 1200

TCAGCACCAG TTAAAACGTG ACCTTCTGTA TCGTTTGCGT CAATGATTCC CAAACGTTTG 1260

GCAATGGCTT CCGCTGTGTC TTGGTGGTCA CCTGTAATCA TAATTGGACG GATTCCCGCT 1320

TCCTTAGCCA CACGAACAGC CTCAGCGGCT TCAGGACGTT CAGGGTCAAT CATCCCAATC 1380

AAACCAGTAA AAATTAAATC ATTTTCAAGC TCTTCAGAAG TGAGATTTTC TGGAATACTA 1440

TCGATAATCT TATAAGCACC TGCAAGGACA CGCAAGGCTT GATGAGCCAT TTCAGAATTG 1500

TTTGTATGAA TGAGATTTGT AACCTTCTCA TCAATCGGAG CAATATCCCC AGCCTTATCA 1560

CGAAGAAGAC AACGTTTTAA GAGTTGGTCT GGCGCACCCT TGACTGCTAC AAGGAAACGA 1620

CTATCTGGCA ATGGGTGAAC TGTTGACATG AGCTTACGGT CAGAGTCAAA TGGCAATTCA 1680

GCTACACGAG GATATTTCTC TAAGAAACCT TTGACATCAT AGCCCTTGTC CAAGGCATAT 1740

TGGATAAAGG CTGTTTCGGT TGGGTCACCA ATCAAGTTAC CTTCCACATC GATTTTCGTA 1800

TCATTGGCCA AGACAACTGA ACGAAGTAGT GGCATTTCAA GACCTAGTTC AATATCATCA 1860

GCTGAGTCAT GTAGAACCGC ATCGTAGAAG ACTTTTTCGA CTGTCATCTT GTTCATAGTC 1920

AGCGTACCAG TCTTATCAGA AGCGATGATT TCAGTTGAAC CAAGTGTTTC AACTGCTGGC 1980

AACTTACGAA CGATGGAATG TCGTTTGGCC AAAACTTGAG TACCAAGAGA AAGAACGATG 2040 

GTAACGATAG CAGGAAGTCC TTCTGGAATG GCTGCAACAG CAAGGGCAAC AGAAGTCAAC 2100 

AACTCACCAA GTGGATTTTT CCCTTGAATG AAGACACCCA CTACAAAAGT AACAAGGGCA 2160 

ATGACCAAGA TAGCATAGGT CAAGACCTTA GAAAGGTTGT TCAAGTTTTG TTTGAGTGGT 2220 

GTATCAGTCT CATCCGCATC TTGAAGCATA CCAGCAATAT GACCAACTTC AGTATACATA 2280 

CCTGTATTGA CAACAACACC CATCCCACGA CCATAGGTTA CGTTTGAGTT TTGGAAGGCC 2340

ATGTTGACAC GGTCACCAAT GCCAGCATCT GTCGCAAGAT CGACTGACAA GTCTTTTTCG 2400

ACTGGTACAG ATTCACCTGT CAAGGCTGCT TCTTCAATTT TAAGAGAGTT GGCTTCTATC 2460

AAACGTAGGT CCGCTGGTAC CACGTCACCT GCTTCAAGGG CAACGATATC GCCTGGTACC 2520

AATTCTTTAG AGTCAATCTC TGCCATGTGT CCATCACGAA GAACGCGGGC AACTGGACTA 2580

GACATGGATT TGAGGGCTTC AATAGCTTCT TCAGCTTTTC CTTCTTGGTA AACACCAAAG 2640

GCAGCGTTGA TGATAACCAC AGCTAGGATG ATAATGGCAT CTGCGATATC TTCCCCACCA 2700

GAAGTCACGA CTGACAAGAT TCTGCCGCAA CTAGGATGAT AATCATCAAA TCCTTAAATT 2760

GCTCGA 2766 
(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1577 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

TGGATTTATC CTCTTTTTCG TTCTTTTGGG AGCAGTTTTT GAGGAAAAAA TGAGAAAAAA 60 

TACGTCCCAA GCTGTGGAGA AATTACTGGA CTTGCAAGCT AAAACCGCAG AAGTCTTGAG 120 

TGATGATAGT TATGTCCAAG TTCCTTTGGA ACAAGTCAAG GTAGGCGACC TGATTCGAGT 180

GCGTCCCGGT GAAAAGATTG CTGTTGATGG TGTCGTAGTA GAAGGTGTCT CTAGTATTGA 240

CGAATCCATG GTGACAGGTG AGAGTCTGCC TGTGGACAAG ACAGTTGGAG ATACTGTCAT 300

TGGCTCAACC ATCAATCATA GTGGAACGCT TGTCTTTAGA GCAGAAAAAG TTGGCTCAGA 360

GACTGTTTTG GCTCAGATTG TGGATTTTGT GAAGAAAGCT CAGACAAGTC GTGCGCCGAT 420

TCAGGACTTG ACGGATAAGA TTTCAGGGAT TTTTGTCCCA GTAGTTGTCA TTTTAGGAAT 480

CATGACCTTT TGGGTTTGGT TCGTCTTGCT CAGGGATAGT GTGGTCGTGC TTGGAGCTAG 540

CTTTGTGTCC TCTCTTCTCT ACGGAGTGGC GGTTTGATTA TCGCCTGTCC TTGTGCCTTG 600

GGACTTGCAA CACCGACAGC CCTTATGGTG GGGACAGGAC GTAGTGCCAA GATGGGGGTT 660

CTCCTCAAAA ATGGAACTGT CTTACAGGAA ATCCAGAAAG TTCAAACTCT TGTCTTTGAT 720

AAGACCGGGA CTTTGACGGA AGGGAAACCT GTGGTAACAG ATATCATCGG CGACGAAGTA 780

GAAGTGTTTG GATTGGCAGC CTCCTTGGAA GATGCTTCTC AACACCCACT GGCTGAGGCT 840

ATCGTTAAGC GAGCGAGTGA AGCTGGACTT GAGTTTCAAA CTGTTGAAAA TTTTCAGGCC 900

TTGCACGGGA AAGGTGTTTC AGGGCGAATC AATGGAAAAC AAGTTTTACT TGGAAATGCT 960

AAAATGCTGG ATGGCATGGA TATTTCTAAT ACTTATCAAG ATAAACTAGA AGAACTAGAA 1020

AAAGAAGCTA AGACAGTTGT GTTTTTAGCT GTTGACAATG AAATCAAAGG CTTGCTTGCT 1080

TTGCAAGATA TTCCTAAGGA AAATGCTAAG CTAGCCATCA GTCAGCTAAA AAAACGTGGT 1140

CTCCGAACAG TCATGCTGAC AGGAGACAAT GCTGGTGTGG CGCGTGCTAT TGCAGATCAA 1200

ATCGGAATTG AAGAGGTCAT TGCAGGCGTC TTGCCAGAAG AAAAAGCCCA TGAAATCCAT 1260

AAACTGCAAG CGGCTGGCAA AGTAGCCTTT GTTGGGGACG GTATCAATGA CGCTCCTGCC 1320

CTTAGTGTAG CAGATGTGGG AATTGCTATG GGTGCTGGAA CAGATATCGC CATCGAGTCA 1380

GCAGATTTGG TGTTGACAAC CAATAATCTT TTAGGAGTGG TTCGTGCCTT TGATATGAGT 1440

AAGAAAACCT TTCATCGAAT TCTACTCAAT CTTTTCTGGG CTTTTATCTA CAATGTTGTC 1500

GGAATTCCGA TTGCAGCAGG AGTCTTTTCA GGTGTTGGCT GGCTCTCAAC CCAGATTGGC 1560
AAGGCTAGCC CAATGGC 1577 
(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1089 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:



AAAATGATAT AATAGAATTT ATGGATAAAA ATAAGATTAT GGGATTAACC CAAAGAGAAG 60

TCAAGGAAAG ACAGGCTGAG GGTTTGGTCA ATGACTTTAC CGCATCAGCC AGTACCAGCA 120

CTTGGCAAAT CGTTAAACGA AATGTCTTTA CCCTTTTTAA CGCTTTGAAC TTTGCCATTG 180

CTTTGGCCCT TGCCTTTGTG CAGGCTTGGA GCAATCTGGT CTTCTTTGCT GTTATCTGCT 240

TTAACGCTTT TTCTGGGATT GTGACCGAGC TACGAGCCAA ACACATGGTG GACAAGCTCA 300

ATCTCATGAC CAAGGAAAAG GTCAAAACCA TCCGTGATGT CAGGAAGTTG CTCTTAATCC 360

TGAAGAATTA GTGCTAGGAG ATGTCATTCG TTTGTCTGCA GGAGAGCAGA TTCCTAGTGA 420

TGCCTTGGTT TTGGAAGGCT TTGCGGAAGT CAATGAAGCC ATGTTAACGG GAGAAAGTGA 480

TTTGGTGCAA AAGGAAGTTG ACGGCTTACT TTTGTCAGGA AGTTTCCTAG CCAGTGGGTC 540

AGTTTTATCT CAAGTTCACC ATGTCGGTGC AGACAACTAT GCTGCCAAAC TCATGCTTGA 600

GGCTAAGACC GTTAAACCCA TCAACTCCCG TATCATGAAA TCGCTGGACA AGTTGGCTGG 660

TTTTACTGGG AAGATTATCA TTCCCTTTGG TCTGGCTCTC TTGCTGGAAG CCTTGCTTTT 720

AAAAGGCCTG CCTCTCAAGT CATCCGTTGT AAACTCGTCG ACAGCTCTTT TGGGAATGTT 780

GCCTAAGGGA ATTGCCCTTT TGACCATTAC TTCGCTCTTG ACTGCAGTGA TTAAGTTGGG 840

CTTGAAAAAG GTCTTGGTGC AGGAGATGTA CTCTGTTGAG ACCTTGGCGC GCGTGGATAT 900

GCTCTGTCTG GACAAGACGG GCACCATCAC CCAAGGAAAG ATGCAGGTGG AGGCTGTTCT 960

TCCACTGACG GAAACTTACG GTGAAGAGGC TATTGCCAGC ATCTTGACTA GCTACATGGC 1020

CCATAGTGAG GATAAGAATC CAACTGCCCA AGCCATTCGC CAGCGTTTGT GGGAGATGTT 1080

GCTTATCCT 1089

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 731 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
GCTAGCAATA TCATGTTTAT GCTTGATTTG GGGAATCATT TAGATCAGTG GTCCTTGAA*. 60 
AAAACTGCAA CAGATTTAGA ACAGAGTCTT CTTGCAAAAG AGAGCGATGT ATTCCTAGTA 120

CAGGGCGATA CGGTTGTTAG TATCAAGAGT TCCGATGTTC AAATAGGAGA TGTCTTGATC 180

TTATCTCAAG GAAATGAAAT TCTGTTTGAT GGACAAGTAG TTTCAGGTTT AGGTATGGTC 240

AACGAAAGTT CCTTGACAGG AGAGAGTTTT CCAGTTGAAA AAAGAGAGTC TGATTTGGTT 300

TGTGCAAATA CAGTATTAGA AACTGGAGAG TTACGCATTC GTGTAACAGA TAATCAGATG 360

AACAGCCGTA TTTTACAGCT GATTGAGTTG ATGAAGAAAT CTGAAGAAAA CAAGAAAACG 420

AAACAACGCT ATTTCATCAA GATGGCGGAT AAAGTCGTCA AATATAATTT CTTGGGGTCT 480

GGGCTGACTT ACCTATTGAC AGGTTCTTTT TCTAAGGCTA TTTCTTTCCT ATTGGTCGAT 540

TTCTCCTGCG CTTTGAAAAT CTCTACTCCT GTAGCTTATT TGACAGTTAT CAAGGTAGGG 600

TTGAACCGTG AAATGGTGAT TAAGGATGGA GATGTTCTGG AGAAATATCT GGTAGTTGAT 660

ACTTTCTTGT TTGATAAGAC AGGACCAATC ACAACTAGTT ATCCTATAGT TGAAAAGGTG 720

TACCCTTTGG G 731

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2197 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

TATATTATTC CATTTGTGGT AAATCTGTAC ATGATAGATT AAGTACTCCG ACTGAAACCA 60 

GTACACTAAT CAAGCTATAG CCAGCTAACA AAAGGAGTAA CCATAGAATA TTAACTTTTA 120

AATTTTCCTT CATCGTTTAC ACCTTCTCTT TCACATTCTT ACCAAGGATA CCAGCTGGGC 180

GGACAATCAA GATCAACAAC AAGATTCCAT AAACAATGGC ATCACGGAAA TCTGACATCC 240

CAAAGGCTGT CGCAAAGGTT TCCAATAGAC CAATCACAAA GCCACCAAGA GCCGCACCAG 300 

GAATAATTCC GATACCACCA AGTACTGCGG CAACGAAAGA TTTAAGACCT GGAGTAACCC 360 

CCATCAAAGG CTCAAGAGAG TTATAATAAA GAGCAATCAG AACACCAGCC GCACCCGCAA 420

GAGCAGAACC CAAAGCGAAG GTAAAGCTAA TCGTACGGTT TATATTGATC CCCATCAATT 480

GCGCCGCGTC GCTATCTACT GATACTGCAC GCATGGCTTT CCCCATCTTA GTCTTTTGGA 540

CAATGACTTG TAACAAAATC ATCAAAATCA AGGAAATGCC CAAAATCATT AACTGCACAT 600

TTGTTAAGCT AATTGGTCCC AAATCATATC GAACTGTTTG AATCGCTTGA GGGAAGGCAC 660

GGGTATTGGC ACCAACCAGA TAGACCATTC CATACTCCAA TAGGAAAGAA ACCCCAATAG 720

CCGTAATCAA AACAGCAATA CGAGTAGAGT GGCGCAAAGG TCGGTAAGCA AGAAACTCAA 780

TCACGACACC AAGAATAGCT GTCGCTAGCA TAGCTACAAT AAGCGCTACA AAGAAATTCA 840

TTTGGAAAGA ATTGATCAAG AAATAACCGA TAAAGGCTCC CATCATATAA ATATCACCAT 900

GGGCGAAGTT GATGAGCTTG ATAATTCCGT AAACCATGGT ATATCCTAGG GCTAACAGCG 960

CGTAAACACT ACCTAGAATC AAACCATTTA CGAGTTGTTG GAGCATAAGA TTCACTCTTT 1020

CTATTTATAA TTCCGAGGGT TTTCCCTCAC TTTTTGATAG GTTCTTATAC TCAATGAAAA 1080

TCAAAGAGCA AACTAGGAAA CTAGCCGCAG GTTGCTCAAA GCACTGCTTT GAGGTTGTAG 1140

ATAAGACTGA CGAAGTCAGT CACATATATA ATCCAAGGCG ACGTTGACGC AGTTTGAAGA 1200

GATTTTCGAA GAGTATTAAA TATCGAAACA GGGAGTGAGT CAAAGGCTCA TTCCCTATTT 1260

CAACATTTTT CTATTATGGT TTTACAACTT CTGCTGCTTC AACTTTACCA TTGTTCATGG 1320

TCATCATGTA AGCAGTTTTG ACTGTGTTGT GGTCTGCATC GAAGCTTGTT TGACCAGTTA 1380

CACCTTCAAA ATCTTTTGTT TTAGCAAGGT TATTCTTGAT TTCACCTGAA TTTTTAGCAC 1440

CTTTTGCTGC GTTTGCTACA AGGTGAACTG AATCATAAGC CAAGGCTGCA AATGTTGAAG 1500

GCTCTTCATT GTACTTAGCA CGGTAAGCGT CAAGGAAGGC TTTAGCTTTA GCTGAAACTT 1560

CTACAGTAGT TGAGAAGCCT GAGATAAAGT AGATGTTTGA TGCTTTTTCA GCAGTTGCTT 1620

GTTGTACAAA CTCCTCACCG TTGAATCCAT CACCACCAAC GATTGGTTTG TCAATTCCCA 1680

TACCACGCGC TTGGTTTACA ATCTTACCAG CCTCATTATA GTAACCAGGA ACAACGATAG 1740

CATCAAAGTC TTTCCCTTTC ATTTTTGTAA GGGCTGCTTG GAAGTCTGTG TCACCTGCTA 1800

CGAAAGTTTC ATCTGCAACG ATTTCACCCT TGTATGACTC GCGGAAAGAT TTGGCAATCC 1860

CTTTAGCATA GTCACTGGCA TTGTCAGTGT AAAGAACAAC TTTCTTAGCA TTTAATTTTT 1920

CAGAAACATA GTTTGAGATA ATTTTTCCTT GGAAGCTATC TTGGAAAGTT CCAATAAAGA 1980

GGTAATCTTG ACCTTTAGTC AATCCATCTT GAGTCGCACT TGGTGAGATC AATGGAACAC 2040

CTGCTTTTGT AGCGTTCGCT ACCGCAGCTG CAGTCGCACC AGATGTCGCA GGTCCTACGA 2100

CTGCTGATAC TTTAGATTGG GTTACAAGGT TAGTTGTAAC TGAAGCAGCC TCAGCTGTTT 2160

CAGACTTATT ATCTTTATCG ACTACTTCGA TTTGTTT 2197

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 900 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

TGTCCCAAGA CCAGACTTGG TATGCTCTGG CCTATGATGG GGCAGAAGTG ATTGGCTTTC 60 

TAACTGTTCA GGAGACTCTC TTTGAAGCAG AAGTCCTGCA AATCGCTGTC AAAGGAGCTT 120

ATCAGGGTCA GGGAATTGCG TCAGCCTTGT TTGCTCAATT GCCGACAGAC AAGGAAATTT 180

TCCTCGAAGT CAGACAGTCA AATCAACGAG CGCAAGCATT TTACAAGAAA GAAAAGATGG 240

CAGTTATCGC TGAGCGAAAG GCCTACTACC ATGACCCAGT CGAGGACGCC ATTATCATGA 300

AGAGAGAAAT AGATGAAGGA TAGATATATT TTAGCATTTG AGACATCCTG TGATGAGACC 360

AGTGTCGCCG TCTTGAAAAA CGACGATGAG CTCTTGTCCA ATGTCATTGC TAGTCAAATT 420

GAGAGTCACA AACGTTTTGG TGGCGTAGTG CCCGAAGTAG CCAGTCGTCA CCATGTCGAG 480

GTCATTACAG CCTGTATCGA GGAGGCATTG GCAGAAGCAG GGATTACCGA AGAGGACGTG 540

ACAGCTGTTG CGGTTACCTA CGGACCAGGC TTGGTCGGAG CCTTGCTAGT TGGTTTGTCA 600 

GCTGCCAAGG CCTTTGCTTG GGCTCACGGA CTTCCACTGA TTCCTGTTAA TCACATGGCT 660 

GGGCACCTCA TGGCAGCTCA GAGTGTGGAG CCTTTTGGAG TTTCCCTTGC TAGCCCTTTT 720 

AGTTCAGTGG GTGGGGCACA CAGAGTTGGT CTATGTTTCT GAGGCTGGCG ATTACAAGAA 780

TTGTTGGGGA AGACACGAGA CGATGCAGTT GGGGAGGCTT ATGACAAGGT CGGTCGTGTC 840

ATGGCTTGAC CTATCCTGCA GGTCGTGAGA TTGACGAGCT GGCTCATCAG GGGCAGGATA 900

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1023 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

CCGGCGATCT TCCGCTAGAA ATAGTCTACC AAGATGAGGA TGTGGCTGTC GTTAACAAAC 60

CTCAGGGAAA TGGTTGTGCA CCCGAGTGCT GGTCATACCA GTGGAACCCT AGTAAATGCC 120

CTCATGTATC ATATTAAGGA CTTGTCGGGT ATCAATGGGG TTCTGCGTCC AGGGATTGTT 180

CACCGTATTG ATAAGGATAC GTCAGGTCTT CTCATGATTG CTAAAAACGA TGATGCGCAT 240

CTAGTACTTG CCCAAGAACT CAAAGATAAA AAGTCTCTCC GCAAATATTG GGCGATTGTT 300

CATGGAAATC TGCCTAATGA TCGTGGTGTA ATTGAAGCGC CGATTGGCCG GAGTGAAAAA 360

GACCGTAAGA AACAGGCTGT AACTGCTAAA GGGAAGCCTG CAGTGACGCG TTTTCACGTC 420

TTGGAACGCT TTGGCGATTA TAGCTTAGTA GAGTTGCAAC TGGAGACAGG GCGCACTCAT 480

CAAATCCGTG TCCACATGGC TTATATCGGC CATCCAGTCG CTGGTGATGA GGTCTATGGT 540

CCTGCAAGAC TTTGAAAGGA CATGGACAAT TTCTTCATGC CAAGACTTTA GGTTTTACTC 600

ATCCGAGAAC AGGTAAGACC TTGGAATTTA AAGCAGATAT CCCAGAGATT TTTAAGGAAA 660

CCTTGGAGAG ATTGAGAAAG TAAGAATGAA AAAGAAATTA ACTAGTTTAG CACTTGTAGG 720

CGCTTTTTTA GGTTTGTCAT GGTATGGGAA TGTTCAGGCT CAAGAAAGTT CCAGGAAATA 780

AAATCCACTT TATCAATGTT CAAGAAGGTG GCAGTGATGC GATTATTCTT GAAAGCAATG 840

GACATTTTGC CATGGTGGAT ACAGGAGAAG ATTATGATTT CCCAGATGGA AGTGATTCTC 900

GTTATCCATG GAGAGAAGGA ATTGAAACGT CTTATAAGCA TGTTCTAACA GACCGTGTCT 960

TTCGTCGTTT GAAGGAATTG AGTGTCCAAA AACTTGATTT TATTTTGGTG ACCCATACCC 1020

ACA 1023
(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 675 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

GCTCGGTACC CGGGGATCCT CTAGAGTCGA TAATATCAAC CTGCAGGTTG ATGAACGAGA 60

TCGGATTGCT CTTGTTGGGA AAAATGGTGC AGGTAAGTCT ACTCTTTTGA AGATTTTAGT 120

TGGAGAAGAG GAGCCAACTA GCGGAGAAAT CAATAAGAAA AAAGATATTT CTCTGTCTTA 180

CCTAGCCCAA GATAGCCGTT TTGAGTCTGA AAATACCATC TACGATGAAA TGCTTCATGT 240

CTTTAATGAT TTGCGTCGGA CGGAGAGACA ACTGCGTCAG ATGGAGCTGG AGATGGGTGA 300

AAAGTCTGGT GAGGATTTGG ATAAACTGAT GTCAGATTAT GACCGCTTAT CTGAGAATTT 360

TCGCCAAGCA GGTGGCTTTA CCTATGAAGC TGATATTCGA GCGATTTTGA ATGGATTCAA 420

GTTTGACGAG TCTATGTGGC AGATGAAAAT TGCTGAGCTT TCTGGTGGTC AAAATACTCG 480

TTTGGCACTT GCCAAAATGC TCCTTGAAAA GCCCAATCTC TTGGTCTTGG ACGAGCCAAC 540

TAACCACTTG GATATTGAAA CCATCGCCTG GCTAGAGAAT TACTTGGTAA ACTATAGCGG 600

TGCCCTCATT ATCGTCAGCC ACGACCGTTA TTTCTTGGAC AAGGTTGCGA CAATTACGCT 660

AGATTTGACC AGCAT 675
(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 582 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

TAGAGTCGAT AGCAATAGAT TGAGTAAGTT GTAGCCTTAC CAGCATCGAT AGAAACAAGG 60

GCACCACGGT GACGTCCACC AATTTCCCCT GGAATCAATG GCAAGTATTG GTCGAAGGTA 120

TGGTTCATGA TACCGTAACC ACGAGTCATT GACAAGAACT CAGTTGAGTA TCCAATCAAA 180

CCACGCGCTG GAACAAGGAA GACCAAACGA GTTTGACCAT TACCAGTTGA AATCATATCC 240

AACATTTCAC CTTTACGTTC AGAAAGGCTT TGGATAACAG ACCCTTGGTA TTCTTCTGGA 300

GTGTCGATTT GTACACGTTC AAATGGTTCA CATTTAACAC CGTCGATTTC TTTTACGATA 360

ACTTCTGGAC GAGATACTTG AAGTTCATAG CCCTCACGAC GCATTGTTTC GATAAGGATT 420

GACAAGTGCA ATTCTCCACG TCCTGAAACA GTCCATTTAT CTGCGTGAAT CAGTTGGGTC 480

AACACGAAGG AACGTCTGTT TGCAATTCTG CCTGCAAGCG TTCTTCCACC TTACGAGAAG 540

TTACCCATTT ACCTTCTTTA CCAGCAAATG GTGAGTTGTT GA 582

(2) INFORMATION FOR SEQ ID NO:70: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1337 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

TTGGATTGAA GAACAAAGAT TTGGACTCTA TTGACCTTAT GGTTTGGGGG AAATTTGGAA 60 

TTTCAAAGTC GCCCAACCCC CTCATTTCTA AAGAATTGGA AGCCGGATGG GACTCTACCA 120

AACACGTTTA ACCCAAAGAA AATTGGCAGG AAGAAATGGA AAAAATTTGA TTTTTAAAAA 180 

ATACTTAAGG AAACTTTAAG CTAGGGAGTG TACCCTAAGT TCAATAAAGT TAAAGAAGAC 240 

CTTAACTTAA ACTCCTAAAA CTTTTTCAAT AATAATCTCC CTATAAAAAT AAAGTCGCCC 300

AATCAGGCGG CTTAATTTTT TTGAAAAATG GGCTTGGTGC CTGAGAATAA ATAGCTTAGT 360

GATAGAAGAA AATGGGGAAA TATGGTATAA TGAAACGATA GATTTTTGAA TAGGAATAAG 420

ATCATGTTTG GATTTTTTAA GAAAGATAAG GCTGTGGAAG TAGAGGTTCC GACACAGGTT 480

CCTGCTCATA TCGGCATCAT CATGGATGGC AATGGCCGTT GGGCTAAAAA ACGTATGCAA 540

CCGCGAGTTT TTGGACACAA GGCGGGCATG GAAGCATTGC AAACCGTGAC CAAGGCAGCC 600

AACAAACTGG GCGTCAAGGT TATTACGGTC TATGCTTTTT CTACGGAAAA CTGGACCCGT 660

CCAGATCAGG AAGTCAAGTT TTCATGAACT TGCCAGTAGA GTTTTATGAT AATTATGTCC 720

CGGAACTACA TGCGAATAAT GTTAAGATTC AAATGATTGG GGAGACAGAC CGCCTGCCTA 780

AGCAAACCTT TGAAGCTTTA ACCAAGGCTG AGGAATTGAC TAAGAACAAC ACAGGATTGA 840

TTCTTAATTT TGCTCTTAAC TATGGTGGAC GTGCTGAGAT TACACAGGCG CTTAAGTTGA 900

TTTCCCAGGA TGTTTTAGAT GCCAAAATCA ACCCAGGTGA CATCACAGAG GAATTGATTG 960

GTAACTATCT CTTTACTCAG CATTTGCCTA AGGACTTACG AGACCCAGAC TTGATTATCC 1020

GTACTAGTGG AGAATTACGT TTGAGCAATT TCCTTCCATG GCAGGGAGCC TATAGTGAGC 1080

TTTATTTTAC GGACACCTTA TGGCCTGATT TTGACGAAGC GGCCTTGCAG GAAGCTATTC 1140

TTGCCTATAA TCGTCGTCAT CGCCGATTTG GAGGAGTTTA GGAGGAAATA TGACCCAGGA 1200

TTTACAGAAA AGAACCTTG? TTGCAGGGAT TGCCCTGGCT ATTTTCCTAC CAATTTTAAT 1260

GATTGGGGGC TCTTGCTTCA GATAGCAATC GGAATCGTAG CCATGCTAGC CATGCATGAA 1320

CTTTTGAAGA TAAGAGG 1337 
(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 818 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

TCGGTACCCG GGGATCCTCT AGAGTCGATA GTCGCCAAGC AGAAGAAGGG AACACCATTC 60 

GTAGAAGACG TGAGTGCGAC GAATGCCAAC ACCGTTTTAC AACCTACGAA CGAGTAGAAG 120 

AAAGAACCTT AGTGGTTGTT AAAAAAGATG GCACACGGGA ACAATTCTCC AGAGATAAAA 180

TCTTTAATGG GATTATCCGC TCAGCCCAGA AACGTCCTGT GTCAAGTGAT GAAATCAACA 240

TGGTGATCCT CTAGAGTCGA ACAGAAACTC CGTGGTCGAA ATGAAAATGA AATTCAAAGT 300 

GAGGACATTG GTTCACTCGT CATGGAGGAG TTGGCTGAAT TGGACGAGAT TACCTATGTA 360

CGTTTTGCTA GTGTCTATCG TAGTTTTAAG GATGTCAGTG AGTTAGAGAG CTTGCTCCAA 420

CAAATCACCC AGTCCTCTAA AAAGAAAAAG GAAAGATAAA TGAAGCCAAT TGACCGTTTT 480

TCTTATCTAA AGAATAATCG GGTGTCGCAA GATACCTCAT CTCTGGTACA GTGCTACCTC 540

CCGATTATCG GTCAGGAGGC ACTGAGCCTT TATCTTTATA CGATTAGTTT TTGGGATAAT 600 

GGTAGAAAGG AATATCTTTT TTCAAGTATC CTCAATCATC TTAACTTTGG AATGGATAGA 660

CTGATAAAAT CATTGAAAAT CTTATCTGCT TTTAATCTCT TGACTCTCTA TCAAAAGGGG 720 

GATGTTTATC AGCTAGCCCT CCATGCTCCT CTATCTAGTC AAGACTTCTT GGGGCATCCT 780

GTTTATCGCA GACTCTTAGA GAAAAAGATT GGGGACAA 818 
(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 746 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

TTACCCGGGG ATCCTCTAGA GTCGATATGC TCTCTGAGGG TCAATTCCTC ATACAGACTA 60 

GGCGTCTCAG GAATGTAGCC AATCTGCTTG CGGTAGCTAG TCGCATCTCC TTGCAGAGTC 120

AGGCCATTGA TATTGATGGA GCCACTATAA GGTGCCAACA GACCGATAAT CTCATTGATC 180

GTCGTTGATT TCCCAGCACC ATTGAGACCA ATCAAACCGA CCAACTGCCC ACTTTCAACA 240

GTAAAGGACA CATCTTTCAA AACAGGAACA TGAACATAGC CACCTGTCAG GTTTTTAATT 300

TCTAACATAT TTTCTCCAAA TCTGGTATAA TGTAGCTATA TTATATCAAA ATTCAGTACA 360

GTAGAGGTAG ATTTTATGTC AGATTGCATT TTTTGTAAAA TCATCGCAGG GGAAATTCCT 420

GCTTCGAAAG TATATGAAGA TGAGCAGGTC CTTGCCTTTC TTGATATCTC TCAAGTAACA 480

CTAGGACACA CCTTGGTCGT GCCAAAAGAA CACTATCGCA ATCTTTTGGA GATGGATGCT 540

ACGAGCGCCA CCAACTCTTT GCCCAAGTAC CAAAAGTAGC TCAAAAAGTC ATGAAAGTCA 600

CTAAGGCTGC TGGTATGAAT ATCATTTCCA ACTGTGAAGA AGTCGCTGGT CAAACAGTTT 660 

TTCATACTCA CGTTCACCTT GTGCCTCGCT ACAGTGCTGA CGATGACCTC AAGATTGATT 720

TTATCGCCCA CGAAACAGAC TTTGAC 746 
(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 767 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

GATCCAAGCA GTCCGTGATG TAAGCTTTGA AGTTAATGAA GGAGAAGTTG TTTCCCTTAT 60

CGGTGCCAAC GGTGCAGGTA AGACAACTAT TCTTCGCACC TTGTCAGGTT TGGTTCGACC 120

AAGTTCAGGA AAGATTGAAT TTTTAGGTCA AGAAATCCAA AAAATGCCAG CTCAGAAAAT 180

TGTGGCAGGT GGTC7TTCAC AAGTTCCAGA AGGACGCCAC GTCTTTCCTG GCTTGACTGT 240 

TATGGAAAAT CTTGAAATGG GAGCTTTCTT AAAGAAAAAT CGTGAAGAAA ATCAAGCTAA 300

CTTGAAGAAG GTTTTCTCAC GCTTTCCTCG TCTTGAAGAA CGTAAGAACC AAGATGCAGC 360 

TACTCTTTCA GGAGGGGAAC AACAAATGCT TGCCATGGGA CGCGCTCTTA TGTCAACACC 420

AAAACTTCTT CTTTTAGATG AACCATCAAT GGGACTTGCC CCAATCTTCA TCCAAGAGAT 480

TTTTGATATC ATTCAAGATA TTCAGAAGCA AGGAACAACC GTCCTCTTGA TTGAACAAAA 540

TGCCAATAAA GCACTTGCAA TCTCTGACCG AGGATATGTA CTGGAACAGG GAAATCGTCT 600 

ATCAGGGACA GGGAAAGACT CGCTCATCAG AGGAGTCAGA GCATATCTAG GTGGTAAACA 660

TCCAGTGGAT TTTTGTCGGC AGTGAGTTCG GGATCATCAT TTAGTTGGGG CTTGTTAGGT 720 

TCAGTAAGTC GGTTATCAAA TCAGGGTTGT TTGCCGCAGT GGGGTCG 767
(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 695 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

GAGCTCGGTA CCCGGGGATC CTCTAGAGTC GATAATTCGT TGGTTGGACG AACCTCGAAA 60 

CTGGAGCATG AGATTTCTCT TAGTTCGATC ATATCTTCCA TCGACAAGAA TGTCAATCAA 120

TGATAAGAGT TCCAGTTTAT CTGGAGTTTC CGGGATCATT TCTTCCCAAG TGTAGCCCGT 180

CCAGGACCAA ATGTCCTTGT CTGGCAATTC CTTTCGGATG CGTTTAACTA GAGGCAAGAG 240

AATGCCAGTA TTGAGAAAAG GCTCCCCTCC CAGCAAAGTC AAGCCTTGAA CATAGGGTTG 300 

GGCAAGGTCT GCCATAATCT GCTCTTCTAA TTCTGCTGTA TAGGGAATGC CAGCATTAAA 360 

AGACCAAGTC GCAACATTAT AACATCCCTC GCAGTGAAAC ATACAGCCTG ATACATAGAG 420

AGAGTTGCGC ACGCCTTCGC CGTCCACAAA GTTAAAGGCC TTGTAGTCAA TGATACGACC 480 

TTGACTAAGT TCCTCGCTTT TCCATTCTTG TGGTTTTGGA TTATTCATTC GCTACCTCTA 540

TCCAATAACG CTCGACTCCA TTGCGAGCAT CCTCAAATAT TCCACCATTT GCTAGAATGA 600

CTGCTCTGCT AGCAGGATTA TTCACGCTAC AGGGCACCAG AGCTTTCTTG ATGTCTTTTC 660

CCTAGCAACT TCAAGCCCTG ACGGAAGTCT TTT7T 695
(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 723 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 
CTCGGTACCC GGGGATCCTC TAGAGTCGAC GGCTACAATG ATATTAAGAT GGATGATGTG 60

ATTGACGCGT ATGTCATGGA AGAAATCAAG AGATAAGATT TTTTGCTCCT TTCTTAGGTG 120

GTGAGGGACG CAAGCAAACC GATGGTTTCA TTGCTTATTT TTGAGCCTAG GGTCTCAAAA 180

ATCCCCTGTG ATGGGACTGA TAAATCAGTT CCATCACTTT CACCACGGCG AAAGAAGCAG 240

ATGACTTCAA ATTGAACTTC GTTTCAATTT AAACTGAAAA TCAAGAAGTT TAAAATAGCT 300

AGGTCTGCTG GCCTAGCTTT TGGTTCAAAG TAGAGAAAGG AATATCATGG TAAATCATTT 360

CCGTATAGAT CGTGTGGGCA TGGAAATCAA GCGTGAAGTC AATGAGATTT TGCAAAAGAA 420

AGTCCGTGAT CCACGTGTCC AAGGTGTGAC CATCACAGAT GTTCAGATGC TGGGTGACTT 480

GTCTGTTGCC AAGGTTTATT ACACCATTTT GAGTAACCTT GCTTCGGATA ACCAAAAAGC 540

CCAAATCGGG CTTGAAAAAG CAACTGGTAC CATCAAACGT GAACTTGGTC GCAATTTGAA 600

ATTGTACAAA TCCCAGATTT GACCTTCGTC AAAGACGAGT CCATCGAGAT GGAACCAAGA 660

TTGACGAGAT GCTACGAAAT CTGGATAAGA CTAAAGAAGA GGGGGTTGCC CCCCTTTTTT 720

GGG 723

(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 970 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

TGTCCTTATT TGTCTGACCA AGTGCAAGCT GGTCGGATTT GTGGTAACAT TGGATAAGAT 60 

TTGACAAAGG AATTTCCATC ATGTAACGGT CTTACTCCAC GAAACGATTG ATATGCTTGA 120

CGTAAAGCCT GAAGGTATCT ACGTTGATGC GACTTTGGGC GGAGCAGGAC ATAGCGAGTA 180

TTTATTAAGT AAATTAAGTG AAAAAGGCCA TCTCTATGCC TTTGACCAGG ATCAGAATGC 240

CATTGACAAT GCGCAAAAAC GCTTGGCACC TTACATTGAG AAGGGAATGG TGACCTTTAT 300

CAAGGATAAC TTCCGTCATT TACAGGCACG TTTGCGCGAA GCTGGTGTTC AGGAAATTGA 360

TGGAATTTGT TATGACTTGG GAGTGTCTAG TCCTCAATTG GACCAGCGTG AGCGTGGTTT 420

TTCTTATAAA AAGGATGCGC CACTGGACAT GCGGATGAAT CAGGATGCTA GTCTGACAGC 480

CTATGAAGTG GTTAATCATT ATGACTATCA TGATTTGGTT CGTATTTTCT TCAAATACGG 540

TGAGGATAAA TTCTCTAAAC AGATTGCGCG TAAGATTGAG CAAGCGCGTG AAGTGAAGCC 600

GATTGAGACA ACGACTGAGT TAGCAGAGAT TATCAAGTTG GTCAAACCTG CCAAGGAACT 660

CAAGAAGAAG GGTCATCCTG CTAAGCAGAT TTTCCAGGCT ATTCGAATTG AAGTCAATGA 720

TGAACTGGGA GCGGCAGATG AGTCCATCCA GCAGGCTATG GATATGTTGG CTCTGGATGG 780

TAGAATTTCA GTGATTACCT TTCATTCCTT AGAAGACCGC TTGACCAAGC AATTGTTCAA 840

GGAGCTTCAA CAGTTGAAGT TCCAAAAGGC TTGCTTTCAT CCCAGATGAT CTCAAGCCCA 900

AGATGGAATT GGTGTCCCGT AAGCCAATCT TGCCAAGTGC GGAAGAGTTA GAAGCCAATA 960

ACCGTTGACT 970
(2) INFORMATION FOR SEQ ID NO:77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 954 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

GAAAGGAGTA ACTGATGCAC GTAACAGTAG GTGAATTAAT TGGTAATTTT ATTTTAATCA 60

CTGGCTCTTT TATTCTTTTG CTAGTCTTGA TTAAAAAAT? TGCATGGTCT AATATTACAG 120

GCATTTTCGA AGAAAGAGCT GA&AAAATTG CTTCAGATAT TGACAGAGCT GAAGAAGCCC 180 

GTCAAAAAGC AGAAGTATTG GCTCAAAAAC GCGAAGATGA ATTGGCTGGT AGCCGTAAAG 240

AAGCTAAGAC AATCATTGAA AATGCAAAGG AAACAGCTGA GCAAAGTAAG GCTAATATCT 300

TAGCAGATGC TAAACTAGAA GCGGGACACT TAAAAGAAAA AGCCAATCAA GAAATTGCTC 360

AAAATAAAGT AGAAGCTTTA CAGAGTGTTA AGGGTGAGGT CGCAGATTTG ACCATCAGCT 420

TAGCTGGTAA AATCATCTCA CAAAACCTTG ACAGTCATGC CCATAAAGCA CTCATTGATC 480

AGTATATCGA TCAGCTAGGA GAAGCTTAAT GGACAAGAAA ACAGTAAAGG TAATTGAAAA 540

ATACAGCATG CCTTTTGTCC AATTGGTACT TGAAAAAGGA GAAGAAGACC GTATCTTTTC 600

AGACTTGACT CAAATCAAGC AAGTTGTTGA AAAAACAGGT CTGCCTTCTT TTTTAAAACA 660

AGTGGCAGTA GACGAGTCGG ATAAGGAAAA AACAATTGCT TTTTTCCAAG ATTCTGTGTC 720

ACCTTTATTA CAAAACTTTA TCCAGGTTCT GGCCTACAAT CACAGAGCAA ATCTTTTTTA 780

TGATGTGCTT GTAGATTGCT TGAACCGACT TGAAAAAGAA ACAAATCGAT TTGAAGTGAC 840

GATTACGTCT GCTCATCCTC TAACTGATGA ACAGAAGACT CGTTTGCTCC CTTTGATTGA 900

GAAAAAAATG TCTCTGAAAG TAAGGAGTGT AAAAGAACAA ATCGATGAAA GTCT 954

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1602 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

CCTGATTATA CCCAACCTCT TTGCATCAAG TCGGAAAAAT GAGTGAAATG GGTTTCCAGT 60 

TTTCCTGAAA TAAGGTATCC TATATAAAGT ACCCTATGAT AACCATGGAG GTATTGTGTA 120

TGGTTCAAAC AAGTCATTGA AGAAATACAA AACAATGCCA ACATTGTGGA AGTCATAGGA 180

GATGTGATAT CTTACAAAAG GCAGGACGGA ACTATCTAGG GCTCTGTCCT TTTCATGGTG 240

AAAAAACACC ATCTTTCAGC GTTGTAGAGA ACAAGCAGTT TTACCACTGT TTTGGTTGTG 300

GTCGCTCAGG TGATGTCTTT AAAATTCATC GAGGAGTACC AAGGGGTTAC CTTTATGGAG 360

GCTGTCCAAA TCTTAGGTCA GCGTGTCGGG ATTGAGGTTG AAAAACCGCT TTATAGTGAA 420

CAGAAGCCAG CCTCGCCTCA CCAAGCTCTT TATGATA?GC ACGAAGATGC GGCTAAATTT 480

TACCATGCTA TTCTCATGAC AACGACTATG GGCGAAGAGG CCAGAAATTA CCTTTATCAG 540

CGGGGTTTGA CAGATGAAGT GCTTAAACAT TTTTGGATTG GTTTAGCACC TCCAGAACGA 600

AACTATCTCT ATCAACGTTT GTCTGATCAG TATCGTGAAG AGGATTTACT GGATTCAGGC 660

CTGTTTTATC TTTCGGATGC CAATCAATTT GTAGACACCT TTCACAATCG CATTATGTTT 720

CCCCTGACAA ATGACCAAGG AAAGGTCATT GCCTTCTCAG GTCGTATCTG GCAAAAAACG 780

GATTCACAAA CTTCTAAGTA TAAAAACAGC CGTTCGACTG TAATTTTTAA CAAAAGTTAC 840

GAATTATATC ATATGGATAG GGCAAAAAGA TCTTCTGGAA AAGCTAGTGA GATTTACCTG 900

ATGGAAGGAT TCATGGATGT TATTGCAGCC TATCGGGCTG GAATCGAAAA TGCTGTGGCG 960

TCGATGGGAA CGGCCTTGAG TCGAGAGCAT GTTGAGCATC TGAAAAGGTT AACCAAGAAA 1020

TTGGTTCTTG TTTACGATGG AGATAAGGCT GGGCAAGCCG CGACATTGAA AGCATTGGAT 1080

GAAATTGGTG ATATGCCTGT GCAAATCGTC AGCATGCCTG ATAACTTGGA TCCTGATGAA 1140

TATCTACAAA AAAATGGTCC AGAAGACTTG GCCTATCTAT TAACGAAAAC TCGTATTAGT 1200

CCGATTGAGT TCTACATTCA TCAGTACAAA CCTGAAAACG GTGAAAATCT GCAGGCTCAG 1260

ATTGAGTTTC TTGAAAAAAT AGCTCCCTTG ATTGTTCAAG AAAAGTCCAT CGCTGCTCAA 1320

AACAGCTATA TTCATATTTT AGCTGACAGT CTGGCGTCCT TTGATTATAC CCAGATTGAG 1380

CAGATTGTTA ATGAGAGTCG TCAGGTGCAA AGGCAGAATC GCATGGAAAG AATTTCCAGA 1440

CCGACGCCAA TCACCATGCC TGTCACCAAG CAGTTATCGG CTATTATGAG GGCAGAAGCC 1500

CATCTACTCT ATCGGATGAT GGAATCCCCT CTCGTTTTGA ACGATTACCG TTTGCGAGAA 1560

GACTTTGCAT TTGCTACACC TGAATTTCAG GTCTTACATG AC 1602 
(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7203 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

CCTCCATCAA ATCTGAGACT GATTCAAAAG ACTGGCTCAT ATTACGATTT TGGTCTAAAT 60

GCGTTAACAC TTGGAGCAAC TTCCGATTTT CGTCTAGTCT AACATCAAAA GGTAATCCCT 120

GATATTGAAT TGCCTGACGA AAGGAAAATA TTAATAGCTG TTGTCATATC CATTCCCAGA 180

TTACTAAACA CCTGTTGGGC CTGCTCCTTA ACCTCACTAT CCAGACGGAT GCTCATACTC 240

ATCTTTGACA TACTCTCACC CTCTTTCCAT AGACTATTTT AACAAAAAAG AAAGCTAATG 300

TAAATCTATT GGATATACGT TAGCCTCTTC TAATAGATTA TTAAGCAATT TTTTAAAACA 360

ACTCATCAAA CAAACTCAAC TGGTTATCCT CTGGCATATT TCCAAGAATA CCCATCTCAT 420

CCATCTTTTC AACCAAGGTT GATGAGAGTC CACCACGCTT GCGTAGTTCT GTTTTAGAGA 480

GGAATTCTCC CTCTTCACGC GCCCGCACCA GTTGCTTGGC AACGTTCTCT CCCAGACCAT 540

CCATTGCTAC AAATGGTGGG ATAAGGGTAT CCCCGTCGAT GAGGAACTCT GTCGCCTGAC 600

TACGGTAGAG ATCTAATTTG CCAAACTTGA AACCTCGTTC CCACATCTCA TTGACAATCT 660

CAAGAGTTGT ATAGAGATCG ATTTCCACAT TAGAGGCTTC ATTGTTCTTC CGTTTTTCAG 720

AGATTTCTTC CATTCTGCGC TTGATGGCCT CCAAGCCCGC ACCCATGGTC TTGATATCAA 780

AAGCCTTAGC ACGAATGGAG AAGTAAGCAC AGTAGTAATA AATAGGATGG TGAACCTTGA 840

AGTAAGCTAC ACGCAAGGCC ATCATAACGT AGGCTGCCGC ATGGGCCTTA GGGAACATGT 900

ACTTAATTTT CCCACAGGAT TCGATATACC ACTCTGGCAC CTTATTAGCC TTCATGGCTT 960

CGATATAGCC ATTTCTCTCC TCTTCTGAAA TCTTTAGCCA CAAACCCTTA CGTACCCGTT 1020

CCATAATGGT AAAGGCCATC TTAGGTTCCA GACCCGCATG CATGAGGTAA ACCATGATGT 1080

CGTCCCGACA ACCGATAACA GTCGATAGGT CCGCTATTCC TTGCTTAATC AGATCCTGAG 1140

CATTCCCCAA CCAAACGTCA GTACCGTGGG ACAGACCAGA CAGCTGAAGC AATTCCGCAA 1200

AGGTTGTCGG ATGGGTTTCG TCTACCATTC CACGTACGAA ATTTGTTCCA AACTCTGGAA 1260

TCCCTAACAT ACCCGTAGCG TTCCAATTTG TTCAGGTGTT ACCCCTAGCA CATCAGTCCC 1320

AGAAAAGAGT GCCATCACGC CTTCGTCATC CATAGGAATT TTATTAGGGT CAATACCAGA 1380

CAAATCCTGA AGTTTTCGAA TCATGGTCGG ATCATCATGT CCCAGTACAT CGAGTTTGAG 1440

GACGTTCTCA TCGATATCGT GGAAGTTAAA GTGAGTGGTC TGCCATTCAG CCGTGACATC 1500

ATCTGCTGGA TACTGGACAG GCGTAAAATC GTAGACATCC ATGTAGTTCG GAATAACAAC 1560

GATTCCCCCC GGGTGTTGGC CTGTTGTCCG CTTGACACCC GCCGCTCCTT GAGCGAGGCG 1620

TTCTACTTCT GCATCACGAT AAAACTTGCC ATAATCTCGC TCGTAACCCT TGACAAATCC 1680

ATAGGCAGTC TTGGCAGCTA CCGTACCAAC TCTTCCCGCA CGGAAGGCAT ATTCTTCACC 1740

AAAGATATCA CGCACATCCA AGTGGGCGCT AGGCTGATCT TCTCCCGAGA AGTTCAAGTC 1800

AATATCAGGA ACCTTATCCC CATCAAAACC AAGGAAGGTC TCAAACGGAA TATCCTGTCC 1860

GTTTTTACTG AGTTTGTGAC CACAG7TTGG ACAGTCCTTA TGGGGCATAT CAAATCCTGA 1920

ACCGTACGAA CCATCTGTGA TAAACTCACT GTACTGACAC TGACCACAGA CATAGTGAGG 1980

AGAGAGAGGA TTGACCTCCG TAATCCCAAT CATGGTCGCA ACGAAACTAG ATCCGACAGA 2040

CCCACGAGAA CCAACCAAAT AACCCCGTTC ATTAGAACGT TGCACCAGCA TCTGCGATGC 2100

CAGATAAATC ACAGCAAATC CATTCCCCAG TATGGATGTT AATTCTTTTT CAATCCGCAA 2160

ATCAACAATA TCTGGCAGCG GATTTCCATA AATCTCAAAA GCTTTCTTAT AGGTCAACTC 2220

AGCAACTGTT TCTTCAGCCT TGTCGATGAA AGGCGTATAC AAGTCACCCT TAACGACTTC 2280

AACGGGTTCA AATATTTCTG CCAAGGCATT GGTGTTTTCA ATAACCAGTT TACGAGCCAG 2340

TTCCTCTCCC AAAAAGGCAA ATTCATCCAA CATCTCATTA GTCGTTCGAA AATGAGCCTT 2400

TGGAAGTGGT GCTGGTTGGG CATGTTCACC ATGACCGATA GTTCGGTTAA TCATCGCACC 2460

CTGTCCCAAA CTACGGACGA TAATTTCACG ATAAATCTCT TCTTCCGGTT CGATATAGTG 2520

AACATTTCCC GTAGCCAAAA CAGGCTTGCC AAGGCGGTCT CCAACCTCTA TCAAACTCTT 2580

GATAATGGTC TGGAGTTCCT CCATATCCTT GACCTGCTCT TTAGCAATCA AGGGCGCATA 2640

GATAGCCGGT GGCATGACCT CGATAAAGTC ATAATACTTG GCCACCTCAA CCGCCGCATC 2700

CACACCTTGA GAAACGACCA CGTCAAAAAC TTCACCCTCT GAACAGGCTG AACCTAAAAT 2760

CAAGCCCTCT CGATGGGCAT CTAGAACCGT TCTCGGAATC CGTGACACTC CTTCAAAATA 2820

CTTGGTATTA GACAAGGAAA CCAGCTTAAA GATATTTTTT AGACCTACCT GATTCTTGAC 2880

ATAGATGGTC GCATGCTTGA TCCGAGCTTT TTTGTAAGAA TCTGGACTGA TTAGATCAAT 2940

GTTGAGTCTA GCTAAATCGG TCACACCATG TTTTTCTGCT ACCTCTTTGA TAAAGATAAA 3000

GCAGACGACC AGTCGCTTCC GCATCGTATT GGCCATGTGG TGATGTTCCA AGCCACACCA 3060

AAACGCTTGG TCAAAGGCCC AAACCATGAT TTATACTCAG GATAGAGGTT TCAGCAAACT 3120

CCAGGTATCA ATAACGGCTG ACTAATCTTT GGCCATGACG CTCATAATTA GCATTCATAA 3180

AGCCAACGTC AAAGGTAGCA TTGTGGGCAA CTAGGACCGT ATCCTTGCAA ATTCTTGGAA 3240

TTCTTGCAAA ACTTGTTCTA GTGGTTTGGC ATTTTTGACA TGATCATCTG TAATTCCAGT 3300

TAACTCTGTA GTAAAAGCTG ACAAGGGATG CCCAGGATTG ATAAATTCAT CAAATTCAGC 3360

AATAACATTC CCCTTGTACA TCTTAGAGGC CGCAACCTGA ATCAAGTCAT TATAGATAGC 3420

TGAAAGTCCC GTCGTTTCCA CGTCAAAGAC CACGTAGGTT GCTTCTGATA AGTCCATCTA 3480

CATTCGTTAT AGACGATAGG GACACGGTCC TCCACGATAT TGGCTTCTAT CCCATAGATC 3540

AGCTGGATTC CCGCTTTCTT AGCCGCCTTA TAGCCATGTG GAAAGGACTG GACATTCCCA 3600

TGGTCTGTGA TAGCAACCGC CTTGTGTCCC CACTTAGCAG CTGTTGCAAC AATCTCTTCG 3660

ACCTCTGGCA AAGCATCCAT AGTCGACATG TTAGTATGAG CATGAAACTC AACCCGACGC 3720

TCACCTTCTG GCATCAAATC CTTCCGCTCA TAGTGAACAA CTTCCTGCAG ATCCTGTACG 3780

TTCATAGTCA AATCGCGTGT GAAGTTATTC ATCTCCACAT TCCCTCGAAC TCGGAGCCAA 3840

GAATTCTTCT TGATGAGGTC AAACTTCTGG GCCTCTTCCT CGTTTTTAAC CCACTTTTGC 3900

ATAGAAAAAC TTGAAGTATA GTCCGTCATT TTAAAGTTGA TTAAAACACG ACCTGTTCTA 3960

GTCACTTTTT GCTCCACATC AAAAACAACC CCTTCAAATA CCAGACGATT TTCCTCTGTC 4020

GTCACTTCGA TCATAGGAGT AATCTCCGCC TTATCCAGCT TGGGTTTAGC TGCAGCTTTT 4080

TTCGCTTGAA AATCAAAGAC TGGTTTCTCT TCCGCTGGAG GAGGTGCCAT CTGCTCCAGT 4140

TGTTCCATAG CACGGAGCGC TTCCTCATTG GCAGCTTGAA CAATCTGCTC ATTTTCAGCA 4200

TGAAAGGCCT CTTCCTGCTC TTGGGTCAGG ACATCATTCT TCTCGACTTG ACAGTTAAAA 4260

GTTGGAAAAC CAAACTTTTC AAGTTGTTTG GCTAAATTAG GAAGATGATT CTTCTTAAAA 4320

TGTTCCTTAT CAATCGCTTC AGATCCTTCA ATAAATAGCT GATTACCCTC AGCACGAACT 4380

TGCAAATTTT GATAAAGGGA CTTAAAACCT TGACTAGCAC ATGGACCTTC AGAGAAAGCC 4440

TCCCTATAGT AGGACTGCAA GAGCTGATTT GAAAATTCTT GAGACCGAGC CTTAATTTCA 4500

AAAACAGCTT TATTGCCTGT CTTAGAAAAT TCTTCGCTCA AACCTTTCTT TAATTCTAAA 4560

AAGATTTCAA TCGGTAAAAT ATTAGAAAAT ACGAAATGAA ACTCCCATAC CTTACTAATT 4620

TTATGAACCA CAACTCGCTC AATATTGGCC TGTGCTAAAG CAGGAGCCTG TCTCATTTCA 4680

GCAGGCATCC CCAATTGATT CATCAAAATT TCAAAACTAT TTGACATTCA TTTTCCTCAC 4740

ATTATTCTTC TACTATTTTA CCATATTTAG AGGTATTTTC TAAAGACAAA AGGAAGCCAC 4800

TAAGTGACTT CCTTCTAGAG TGAGGACGGA TTAGTCTTCA CCTTTATTTT TCTTAATAAT 4860

TTCTTCTTGT ACTGACTTAG GTACATCTTC GTAGTGGTCA AATACCATCA TGAATGTACC 4920

ACGTCCTTGA GATGCAGAAC GAAGAACTGT TGCGTAACCG AACATTTCAG CAAGTGGAAC 4980

GTAAGCACGA ACGATTTGGC TGTTACCGTG TGCTTCCATA CCATCTACAC GTCCACGACG 5040

AGCAGTTACG TGACCCATAA CATCACCAAG GTTTTCTTCT GGAACAGTGA TTGTTACAAG 5100

CATCATTGGT TCAAGGATAG CTGGTTGTGC TGATTTAGCA GCTTCTTTAA GGGAAAGTGA 5160

AGCCGCAATC TTGAAGGCAG TTTCAGATGA GTCGACATCG TGATATGAAC CATCATAAAG 5220

CTTAGCTTTA ACGTCAACCA TTGGGTAACC TGCAAGAACA CCGTTAGCCA TAGATTCTAC 5280

CAAACCTTTT TCAACCGCTG GGATAAATTC ACGAGGAACC ACACCACCGA CGATTGCGTT 5340

TTCGAATTCG AATCCTTTAC CTTCTTCGTT TGGAGTAAAT TCAATCCATA CATCACCGAA 5400

TTGACCTTTA CCACCAGACT GACGTTTGAA GAATCCGCGT GCTTGAGTAG AAGCGCGGAA 5460






TG1TTCACGG TAAGATACTT GAGGAGCACC TACGTTCGCT TCAACTTTGA ACTCACGACG 5520

CATACGATCA ACAAGGACGT CAAGGTGAAG TTCACCCATA CCTGAGATAA CTGTTTCACC 5580

AGTTTCAACG TTTGTTTCAA CGCGGAATGT TGGATCTTCT TCAGCCAATT TTTGAAGGGC 5640

GATACCCATC TTGTC7TGGT CAGCTTTAGA TTTTGGCTCA ACCATCAAT7 GGATAACTGG 5700

TTCTGGAACG TTGATTGACT CAAGGATGAT TTTAGCTTTT TCATCTGTCA ATGAGTCACC 5760

AGTTGTAGTA TCTTTCAAAC CAACGGCAGC AGCGATATCA CCTGAGTAAA CAGTGTCGAT 5820

TTCTTGACGG CTGTTAGCGT GCATTTGAAG GATACGTCCG ATACGTTCAC GTTTACCTTT 5880

AGAAGTATTC AATACGTATG AACCTGATTG AAGAACACCT GAGTAAACAC GGAAGAATGT 5940

CAAACGACCT ACGAATGGGT CAGTCATGAT CTTGAAGGCA AGAGCTGCAA ATGGCTCTTC 6000

GTCAGATGCT GGACGAATTT CTTCAGCGTC TGTATCTGGG TTAATACCTT TGATTGCTGG 6060

GATGTCAAGT GGACTTGGAA GGTAGTCGAT AACCGCATCA AGCATCAATT GAACACCTTT 6120

GTTTTTGAAG GCTGAACCAC ACAATACTGG GAAGAATTCA ACGTTGATAG TCGCTTTACG 6180

GATACCAGCT TTCAATTCTT CGTTAGTGAT TTCTTCACCT TCGAGGTATT TCATCATCAA 6240

TTCTTCGTCA GTTTCAGCAA CTGCTTCAAT CAATTTTTCA CGGTATTCTT GAGCTTGGTC 6300

AAGGTATTCA GCTGGGATGT CTTCTTCAAG GATATCCGTA CCAAGGTCGT TAGTATAGAT 6360

TTCAGCTTTC ATCTTGATCA AGTCAATGAT ACCACGGAAG TCATCTTCAG AACCGATTGG 6420

CAATTGGATT GGGTGTGCAT TTGCTTGAAG ACGATCGTGA AGTGTGCTTA CAGAGTAAAG 6480

GAAGTCAGCA CCGATTTTGT CCATTTTGTT GGCAAATACG ATACGTGGAA CTCCGTACTC 6540

AGTTGCTTGA CGCCAAACTG TTTCAGTTTG AGGCTCAACA CCTGATTGTG AGTCAAGAAC 6600

GGTAACCGCA CCATCCAATA CACGAAGAGA ACGTTGTACT TCGATTGTGA AGTCCACGTG 6660

TCCTGGTGTG TCGATGATGT TTACGCGGTG GTTGTTCCAT TGAGCTGTTG TCGCAGCAGA 6720

TGTGATCGTG ATACCACGTT CTTGCTCTTG CTCCATCCAG TCCATTTGTG ACGCACCTTC 6780

GTGAGTTTCA CCGATTTTGT GGATTTTACC AGTGTAGTAA AGAATACGCT CAGTAGTTGT 6840

TGTTTTACCG GCATCGACGT GAGCCATGAT ACCGATATTA CGAGTTTTTT CAAGTGAAAA 6900

TTCGCGTGCC ATGAGGTTTG TTTCTCCTAT TTATTTTTGA TTTCTATTCT ATTATAACAC 6960

GATTTTAATA AAAACGGATA GGCAGGACCT ACCCGTTCTC AATGTTTTCA TGCTATTGTT 7020

GGTTTCAACT TACGAGATGG TAAGCTGAGT TATAGCTAAT ACTAATCGAT TTAGCTAATT 7080

TGAACCCGGG CTAAAGTTAG TTAGCCGATA TGAGCTGGAA CGGGATGCTG CGCGAAAAAG 7140

ATAAAACTCC TTGTATTCAT CGAATACTGC GTCAGTTTCC TATTTTCACC TTGCATCCTT 7200

ACC 7203

(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1581 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

GCATACAAAG GGCATCAAGA ATATCCGTGT TGCCACAAGC TGCGAGAAAG ATTTATGCCT 60

ACCGCCGTTA TGACCTTAAT GAATCTCCAA AGACCGCTTT AGACCTTATC ATCCCAGATT 120

TGTTTTTGCA TATTTTGAAC CCTGCTGAAC GTGAAAGAAA ATTAAAGCGC GAAGGTGTAG 180

AAGAATTATA TCTCCTTGAT TTTAGTAGTC AATTCGCTAG TCTCACTGCA CAAGAATTCT 240

TTGCAACTTA TATCAAGGCT ATGAATGCCA AAATTATTGT TGCAGGTTTT GATTATACAT 300

TTGGTTCTGA CAAAAAAACA GCAGAAGATT TAAAGGATTA CTTTGATGGA GAAGTTATCA 360

TTGTTCCACC TGTAGAAGAT GAGAAAGGAA AGATTAGTTC AACTCGTATC CGTCAAGCTA 420

TTTTAGATGG AAATGTGAAA GAAGCAGGAA AACTTTTGGG GGCACCGCTT CCATCAAGAG 480

GTATGGTAGT TCATGGTAAT GCTCGTGGTC GTACAATTGG TTATCCGACA GCGAATTTAG 540

TGCTTTTAGA CCGTACTTAT ATGCCAGCAG ATGGCGTTTA TGTCGTTGAT GTTGAGATTC 600

AAAGACAGAA GTATCGTGCT ATGGCTAGTG TCGGGAAAAA TGTGACCTTT GATGGAGAAG 660

AAGCACGTTT TGAAGTCAAT ATTTTTGATT TTAATCAAGA TATTTATGGG GAAACCGTCA 720

TGGTTTATTG GCTTGATCGC ATTCGTGATA TGACCAAATT TGACTCAGTT GACCAATTAG 780

TGGATCAGTT AAAGGCTGAT GAAGAAGTAA CTCGGAATTG GTCTTAAGAG CTTGAGTAAA 840

TAAAACAAAA AAGAGGTTGT CTGTAACCCA AAAGATAGAT GATTTAGTCT AACTTTTGAG 900

GTCACGACAT TACCTCTTTT TATTCTTTTT CAAAGGTGAA GCCTTCTCCT AGGATTTCAT 960

GGGCTTCTGT AATAGTTATA AAGGCTTGAG GATCGATTCG ATGAATCATT TCCTTCGTTT 1020

TCACAATTTC ATTTCTTCCG ACAATACAGT AGATGATTTT CAAATTTTCT TTACTATAGT 1080

AGCCTTGACC AGAAATAAAA GTAACACCTC TTCCGAGGTC ATCATTAATC GCCTTAGCAA 1140

GTTGGTCAGG ACGTTTTGTG ATAATCATAA AGCCTTTGCC GGCATATCCT CCTTCACCAA 1200

TCAAATCAAT AACACGAGAA ACAATAAAAT CAAACAAAAG CGTGTAGGAA ACCAATCTCA 1260

AATCCTTGAA GATTAGGAGA ATCAACATGA GAATACAAAA ATCTAAGATA AAGAGCAGTT 1320

TTCCTATGGA TATATGAGTG TATTTGTTGA GAATACGAGC TAGAATATCA GTTCCGCCAG 1380

TTGTACCTCC AGCATTAAAA ATAATTCCAA GGCCAATTCC CAATAGGATT CCCGCTATAA 1440

GGGCTGTGAT TAGTAAATCA CCTTGAAGAT CAATATGAAG GGGAATATGC TCAAAAAAAG 1500

CTAACCAGGC GGACAAAGCT AAGGTTCCTA GTAAACTAGA ATAGAGGGAT TTGGCTCCAA 1560

AGATCTTCCA AGCTAGGATG A 1581

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 879 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

CAAGTTTGTC GAATTGCCAA ACACAGTTGA AGGCTTGATT CACATCACTA ATCTACCTGA 60

ATTTTATCAT TTCAATGAGC GTGATTTGAC TCTTCGTGGA GAAAAATCAG GTATCACTTT 120

CCGAGTGGGT CAGCAGATCC GTATCCGTGT TGAAAGAGCG GATAAAATGA CTGGAGAGAT 180

TGATTTTTCA TTCGTACCTA GTGAGTTTGA TGTGATTGAA AAAGGCTTGA AACAGTCTAG 240

TCGTAGTGGC AGAGGGCGTG GTTCAAATCG TCGTTCGGAT AAGAAGGAAG ACAAGAGAAA 300

ATCAGGACGC TCAAATGATA AGCGTAACAT TTCACAAAAA GACAAGAAGA AAAAAGGAAA 360

GAAACCTTTT TACAAGGAAG TAGCTAAGAA AGGAGCCAAG CATGGCAAAG GGCGAGGGAA 420

AGGTCGTCGC ACAAAATAAA AAGGCACGCC ACGACTATAC AATCGTAGAT ACGCTAGAGG 480

CAGGGATGGT CCTGACTGGA ACTGAAATCA AGAGTGTACG AGCTGCTCGA ATTAATCTCA 540

AGGATGGCTT TGCTCAAGTG AAAAATGGAG AAGTTTGGCT GAGTAATGTT CATATCGCGC 600

CTTACGAAGA GGGCAATATC TGGAACCAGG AACCAGAACG TCGTCGTAAA CTCCTGCTCC 660

ATAAAAAGCA AATTCAAAAA TTGGAACAAG AGACCAAAGG GACAGGAATG ACCTTAGTTC 720

CCCTTAAGGT CTATATAGAT GGCTACGCTA AGCTTCTTTT AGGACTTGCC AAGGGAAGCA 780

TGACTATGAC AAACGGAGTC TATCAAACGT CGTGAGCAAA TCGAGATATC GCGCGTGTGA 840

TGAAGCTGTT AATCAGCGAT AAAGAGAGGA ATTGAGATG 879

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1550 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

AAAAGCTTAA TAAATCAATA ATTTCTTCTT TTATCCCCAA CCTGTGGATA AAGTTTGGTA 60

ACATTGTGGA TTATTTTTCA CAGCTTGTGG AAAATTCTTG CTATCTATGG TAAAATATCT 120

CTAGTATTAA ACTTTTAAAT AGTAAAGGAG GAGAAAGGAT TGAAAGAAAA ACAATTTTGG 180

AATCGTATAT TAGAATTTGC ACAAGAAAGA CTGACTCGAT CCATGTATGA TTTCTATGCT 240

ATTCAAGCTG AACTTATCAA GGTAGAGGAA AATGTTGCCA CTATATTTCT ACCTCGCTCT 300

GAAATGGAAA TGGTCTGGGA AAAACAACTA AAAGATATTA TTGTAGTAGC TGGTTTTGAA 360

ATTTATGACG CTGAAATAAC TCCCCACTAT ATTTTCACCA AACCTCAAGA TACGACTAGC 420

TCACAAGTTG AAGAAGCTAC AAATTTAACT CTTTATGACT ATAGTCCAAA GTTAGTATCT 480

ATTCCTTATT CAGATACGGG ATTAAAAGAA AAGTATACCT TTGATAACTT TATTCAAGGG 540

GATGGAAATG TTTGGGCTGT ATCAGCCGCT TTAGCTGTCT CTGAAGATTT GGCTCTGACC 600

TATAACCCTC TTTTTATCTA TGGAGGACCA GGCCTTGGTA AGACTCACTT ATTAAACGCT 660

ATTGGAAATG AAATTCTAAA AAATATTCCT AATGCGCGTG TTAAATATAT CCCTGCCGAA 720

AGCTTTATTA ATGACTTTCT TGATCACCTA AGACTTGGGG AAATGGAAAA GTTTAAAAAG 780

ACCTATCGTA GTCTTGATCT TTTGTTAATC GATGATATCC AGTCACTCAG CGGAAAAAAA 840

GTCGCAACTC AGGAAGAATT TTTCAATACC TTTAACGCCC TTCATGACAA GCAAAAACAG 900

ATTGTCCTAA CGAGTGATCG TAGTCCAAAA CATCTAGAAG GGCTCGAGGA GAGGCTTGTC 960

ACGCGTTTTA GTTGGGGATT GACACAAACT ATCACACCCC CTGACTTTGA AACACGTATT 1020

GCCATTTTAC AAAGTAAAAC GGAACATTTA GGCTACAATT TCCAAAGTGA TACTCTAGAA 1080

TACCTAGCTG GGCAATTTGA TTCAAATGTT CGAGATCTTG AGGGAGCCAT CAACGACATC 1140

ACTTTAATTG CCAGAGTAAA AAAAATCAAG GATATCACTA TTGATATTGC TGCAGAAGCC 1200

ATTAGAGCCC GCAAACAAGA TGTTAGCCAA ATGCTCGTCA TCCCAATTGA TAAAATCCAA 1260

ACTGAAGTTG GTAACTTTTA TGGTGTTAGT ATCAAAGAAA TGAAGGGAAG TAGACGCCTT 1320

CAAAATATTG TTTTGGCCCG TCAAGTAGCC ATGTATTTAT CTAGAGAACT AACAGATAAT 1380

AGTCTTCCAA AAATTGGGAA GGAATTGGGG GAAAAGTCAT ACCACAGTCA TTCATGCCCA 1440

TGCCAAAATA AAATCTTGAA TTGATCAAGA CGATAATTTA CGTTTAGAAA TTGAATCATC 1500

AAAAGGAAAA TCAAATAATT TGTGGAAACT TTAGGTTTTT ACCTTTTAGC 1550
(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1292 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

GGTATGCGCC AAAACTTCTT ATCAAAAAGA ATCTAGCCAA AGAAGCGACT GCTCAAGCTG 60

TAGGTGAACT TCGTGGTAAA CAAAAATCGG AAGAAAAAGC TCACGCTGAG ATGATTGCAG 120

AAGGAAAAGC AATTAAAGCA CAACTTGAAG CAGAAGAAAC TGTTGTAGAA TTTGTTGAAA 180

AAGTTGGTCC AGATGGTCGT ACCTTTGGTT CTATTACCAA TAAGAAGATT GCAGAAGAAT 240

TGCAAAAGCA ATTTGGAATT AAGATTGATA AACGTCATAT TCAAGTACAA GCTCCGATTC 300

GAGCGGTTGG TTTGATTGAT GTGCCAGTGA AAATCTATCA AGATATCACA AGTGTAATCA 360

ATCTTCGTGT GAAAGAAGGA TAAGTTTACA CCTTCTTGAC AAGATTGTAA AAGGAAGGGA 420

AGTCTGATGG CAGAAGTAGA AGAGTTACGA GTACAACCTC AAGATATCTT AGCTGAGCAA 480

TCCGTTTTAG GGGCTATCTT TATTGATGAG AGTAAACTTG TTTTTGTGCG AGAATACATT 540

GAGTCTCGGG ACTTTTTTAA GTATGCCCAT CGTTTGATTT TCCAAGCCAT GGTCGATTTA 600

TCCGATCGTG GTGATGCCAT AGATGCAACA ACGGTTCGTA CTATCCTTGA TAATCAAGGT 660

GATTTACAGA ATATTGGTGG CTTGTCTTAC TTGGTTGAGA TTGTTAATTC TGTGCCAACT 720

TCTGCTAATG CGGAGTATTA TGCTAAGATT GTTGCAGAAA AAGCAATGCT ACGTCGTTTA 780

ATTGCCAAGT TGACAGAGTC TGTCAACCAA GCTTACGAAG CGTCACAACC AGCTGATGAA 840

ATTATTGCTC AGGCAGAAAA AGGGTTGATT GATGTCAGTG AAAATGCAAA TCGAAGCGGG 900

TTTAAGAACA TTCGAGATGT GTTGAATCTC AACTTTGGAA ATCTGGAAGC TCGCTCGCAA 960

CAAACGACCG ATATTACAGG TATTGCGACA GGTTATCGTG ATTTGGATCA TATGACAACA 1020

GGACTTCATG AGGAGGAGTT GATTATCTTA GCAGCTCGTC CAGCAGTTGG TAAGACAGCA 1080

TTTGCCTTGA ATATCGCTCA GAATATTGGG ACTAAGTTGG ACAAAACGGT TGCTATTTTT 1140

TCACTCGAAA TGGGTGCGGA AAGCTTGGTA GACCGTATGT TAGCTGCAGA AGGCTTGGTG 1200

GAGTCACATT CTATCCGTAC AGGGCAATTG ACAGATGAGG AGTGGCAAAA ATATACTATT 1260

GCTCAGGGTA ATCGTACTAA CGCCAGTATC TA 1292
(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1876 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

AGGCTGATCC TGCGGTTATT GACCCATGGA AGAGCACCTA GAAGGAGATC ATTCCCAGAC 60

GATATTTTAG TTTTATCCTA GTAGCCTTCC CTGGCTATTT TAGGAGCTCG TCTCTACTAT 120

GTATTTCCGA TTTGATTACT ATAGTCAGAA TTTAGGAGAG ATTTTTGCCA TTGGAATGGT 180

GGTTGGCCAT TTACGGTGGT TGATAACTGG GGCTCTTGTG CTCTATATCT TTGCTGACCG 240

TAAACTCATC AATACTTGGG ATTTTCTAGA TATTGCGGCG CCTAGCGTTA TGATTGCTCA 300

AAGTTTGGGG CGTTGGGGTA ATTTCTTTAA CCAAGAAGCT TATGGTGCAA CAGTGGATAA 360

TCTGGATTAT CTACCTGGCT TTATCCGTGA CCAGATGTAT ATTGAGGGGA GCTACCGTCA 420

ACCGACTTTC CTTTATGAGT CTCTATGGAA TCTGCTTGGC TTTGCCTTGA TTCTGATTTT 480

TAGACGGAAA TGGAAGAGTC TCAGACGAGG TCATATCACG GCCTTTTACT TGATTTGGTA 540

TGGTTTCGGT CGTATGGTCA TCGAAGGTAT GCGAACAGAT AGTCTCATGT TCTTCGGACT 600

TCGAGTGTCC CAATGGCTGT CAGTTGTCCT TATCGGTCTC GGTATAATGA TCGTTATTTA 660

TCAAAATCGA AAGAAGGCCC CTTACTATAT TACAGAGGAG GAAAACTAAA TGTTAGAAGT 720

TGCATATATT CTTGTTGCCC TAGCTTTGAT TGTCTTTTTG GTCTATCTGA TCATTACTGT 780

ACAAAAGCTT GGTCGTGTCA TCGATGAAAC AGAAAAGACG ATTAAAACCT TGACTTCAGA 840

TGTGGATGTG ACCTTGCATC ACACCAATGA GTTGTTGGCT AAGGTCAATG TCTTGGCAGA 900

TGATATCAAT GTCAAGGTGG CTACGATTGA TCCACTCTTC AGTGCTGTTG CAGATTTATC 960

TCTATCTGTT TCAGACCTCA ATGACCATGC GCGTGTCTTG AGCAAGAAAG CTTCATCAGC 1020

TGGTTCAAAA ACACTCAAGA CTGGTGCAAG TCTGTCAGCT CTTCGTCTTG CAAGTAAATT 1080

TTTCAAAAAA TAAAAAAGGA GAATCCTTAT GGGTAAATTA TCCTCAATCC TTTTAGGAAC 1140

GGTTTCAGGT GCAGCTCTTG CCTTGTTTTT AACAAGTGAT AAGGGCAAAC AAGTTTGCAG 1200

TCAGGCTCAA GATTTTCTAG ATGATTTGAG AGAAGATCCG GAGTATGCCA AGGAGCAAGT 1260

CTGTGAAAAA CTGACAGAAG TTAAGGAGCA GGCTACAGAT TTTGTTCTGA AAACAAAAGA 1320

ACAGGTTGAG TCAGGTGAAA TCACTGTGGA CAGTATACTT GCTCAAGCTA AATCCTATGC 1380

TTTTCAAGCG ACAGAAGCAT CAAAAAATCA ATTAAATAAT CTCAAGGAAC AATGGCAAGA 1440

AAAAGCCGAA GCTCTTGATG ACTCAGAAGA GATTGTGATT CATATAACAG AAGAATAAAC 1500

CATCACCATC TCCGGACGGA CTATGTATCT GGGGATGGTG ATTTTTATCT GGAATCTAGT 1560

CTTTGTGGTA TAATAATTAC TATGCAGAAA AAACCAACGT CAGCCTATGT GCACATCCCA 1620

TTTTGTACCC AGATTTGTTA TTATTGTGAT TTTTCAAAGG TCTTCATCAA AAATCAGCCA 1680

GTCGACAGCT ATTTAGAGCA TCTGCTGGAA GAGTTTCGTT CTTATGATAT TGAAAAGTTG 1740

TCAACCCTTT ATATCGGTGG TGGAACACGA CAGCCCTGTC GGCTCCGCAA CTGGAGGTGT 1800

TACTGAATGG CTTGACTAAA AACTTGGATT TGTCTGCTTG GAGAGTGACC ATTGAAGCCA 1860

TCCAGGCGAT TTGGAA 1876
(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1574 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
TTGGAAGATT TCCCACTTTC AGTGACCAAC CCATACGGTC GTACTAAGCT CATGCTAGAG 60

GAAATTTTGA CTGATATTTA CAAAGCAGAC TCAGAATGGA ATGTTGTCTT GCTTCGTTAC 120

TTTAACCCAA TCGGAGTCCA TGAGAGTGGT GATTTGGGAG AAAATCCAAA CGGTATTCCA 180

AACAATCTCT TGCCATATGT GACTCAAGTA GCCGTTGGAA AATTAGAGCA AGTGCAAGTG 240

TTTGGAGACG ATTACGATAC GGAAGATGGA ACAGGTGTTC GTGACTATAT CCACGTTGTC 300

GATTTGGCTA AGGGTCACGT TGCAGCTTTG AAAAA\ATCC AAAAAGGTTC AGGACTAAAC 360

GTTTATAACC TTGGAACTGG TAAAGGTTAC TCAGTTCTTG AAATTATCCA AAACATGGAA 420

AAAGCGGTGG GATGTCCTAT TCCTTACCGC ATCGTAGAAC GTCGCCCAGG TGATATCGCT 480

GCCTGCTACT CAGACCCAGC AAAGGCTAAA GCAGAACTCG GTTGGGA^GC AGAACTCGAC 540

ATCACCCAAA TGTGTGAAGG CCATGGCGTT GGCAGAGCAA GCATCCAAAT GGATTTGAAG 600

ACTAAGATGA TGATTTCAAT CATCGTCCCT TGTTTAACGA AGAGGAAGTA CTTCCTCTTT 660

TTTATCAGGC TCTGGAAGCT TTACTTCCAG ATTTGGAAAC AAAATCGAGT ATGTCTTTGT 720

CGATGATGGA TCAAGTGATG GGACCTTGGA ACTCTTAAAG GCCTATCGGG AGCAAAATCC 780

GGCAGTCCAT TATATTTCTT TCTCTCGAAA TTTTGGCAAA GAAGCAGCCC TTTATGCAGG 840

CTTGCAATAT GCGACAGGAG ATTTGGTGGT GGTGATGGAT GCAGACCTCC AAGATCCTCC 900

TAGTATGTTG TTTGAGATGA AAAATGTACT AGACAAAAAT GTAGACTTGG ACTGCGTTGG 960

GACACGGAGA ACTAGTCGGG AGGGAGAACC CTTCTTTCGC AGTTTCTGTG CTGTTCTCTT 1020

TTATCGCCTC ATGCAAAAAA TCAGCCCAGT AGCTCTGCCG TCGGGTGTCC GTGATTTTCG 1080

TATGATGAGA AGGTCTGTGG TCGATGCCAT TTTAAGCTTG ACTGAGTCCA ATCGTTTTTC 1140

TAAGGGACTC TTTGCCTGGG TCGGCTTTAA AACCCACTAT CTGGACTATC CAAATGTCGA 1200

AAGGCAGGCT GGCAAGACCA GTTGGAGTTT TAGGCAACTC TTTTTTTACT CCATTGAAGG 1260

GATTGTTAAT TTTTCAGATT TCCCTTTGAC TATAGCCTTT GTAGCTGGTC TCCTATCTTG 1320

TTTTCTTTCT CTGCTGATGA CCTTTTTTGT TGTGGTTCGG ACCCTCATTT TGGGCAATCC 1380

GACATCTGGT TGGACCTCTC TGATGGCTGT TATTCTCTAT CTTGGAGGCA TTCAACTCTT 1440

GACCATTGGG ATTCTCGGTA AGTATAATCA GTAAGATTAT TTAGAGACTA AAAAAAAGAC 1500

CACTTTATCT TATCAAAGAA AAAGTGACCT TCCTGATTTT ACAGAAACCT AAAGTGAAAA 1560

GACTATAATT TTCC 1574

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 969 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

GTTATAATTA TTGATGATAA TTATAGTAAT GTAAATTTAA GAAATAAAAT TATCCATCAA 60

TTTGGCTATA CCAATCATAG AATTAAGTTA ATTTTAAGTA ATGAAGATTT AGGTGCAACT 120

AATGCCAGAA ACATAGGTAT CAAAAATTCT AGAGGTAAGT ATATATCATT TTTAGACGAT 180

GATGATGAAT ATATGCCAGA TCGAATTTTA AAGTTGATGG CTTGTTTTAA AAAGAGTAGA 240

ATGAAGAATT TAGCTTTAGT TTATAGTTAT GGCATAATAA TTTATCCAAA TGGTACACGA 300

GAAGAGGAGA AGACCGATTT TGTTGGAAAT CCCTTGTTTG TTCAAATGGT TCACAATATA 360

GCAGGTACGT CATTTTGGTT GTGTAAAAAA GAGGTGCTAG AATTAATTAA TGGTTTTGAG 420

AAAATAGATT CACATCAGGA CGGTGTTGTT TTATTAAAAC TACTTGCTCA AGGATACCAA 480

ATTGATATAG TGCGAGAATT CTTGGTGAAT TACTACGCTC ACAGTAAAGA AAACGGTATC 540

ACTGGAGTGA CACAAAAAAC AATTAATGCA GATGAAGAAT ATTATAATTA CTGTAGGAAA 600

TATTTTAATT TATTGAGTTT CAACGAGAGA ATATTGGTTA CAAAGAAATA TTATTCTTTA 660

AACATAAAGC GGTTACTATT AATAGGAGAC AAATGCAAGG CTTTAAAAGT AATCAAGAAG 720

GCAAGAGAAG AAAAAATTTT TAACGAATTT CTTTTTTTGA AATATATGTT ATTATATAAC 780

GTAGTTTTTT CTATTGTATA TATGACAACT ATGTTCAATT AAAATTTAGA AAGTGAGAAA 840

CTATTGTGTA TACTATTATA AATTCAATAT AAACATTTAG GTTAATTAAC GATAATTAAT 900

CGGTGCTGGG TCATTAATTG CTAATTTAAT GCAGCACTAT TAATGCTCAG GTGTTGAATG 960

AATTAATGC 969
(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1353 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1350
(D) OTHER INFORMATION: DNA B

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

ATG GCA GAA GTA GAA GAG TTA CGA GTA CAA CCT CAA GAT ATC TTA GCT 48
Met Ala Glu Val Glu Glu Leu Arg Val Gln Pro Gln Asp Ile Leu Ala
1 5 10 15

GAG CAA TCC GTT TTA GGG GCT ATC TTT ATT GAT GAG AGT AAA CTT GTT 96
Glu Gln Ser Val Leu Gly Ala Ile Phe Ile Asp Glu Ser Lys Leu Val
20 25 30

TTT GTG CGA GAA TAC ATT GAG TCT CGG GAC TTT TTT AAG TAT GCC CAT 144
Phe Val Arg Glu Tyr Ile Glu Ser Arg Asp Phe Phe Lys Tyr Ala His
35 40 45

CGT TTG ATT TTC CAA GCC ATG GTC GAT TTA TCC GAT CGT GGT GAT GCC 192
Arg Leu Ile Phe Gln Ala Met Val Asp Leu Ser Asp Arg Gly Asp Ala
50 55 60

ATA GAT GCA ACA ACG GTT CGT ACT ATC CTT GAT AAT CAA GGT GAT TTA 240
Ile Asp Ala Thr Thr Val Arg Thr Ile Leu Asp Asn Gln Gly Asp Leu
65 70 75 80

CAG AAT ATT GGT GGC TTG TCT TAC TTG GTT GAG ATT GTT AAT TCT GTG 288
Gln Asn Ile Gly Gly Leu Ser Tyr Leu Val Glu Ile Val Asn Ser Val
85 90 95

CCA ACT TCT GCT AAT GCG GAG TAT TAT GCT AAG ATT GTT GCA GAA AAA 336
Pro Thr Ser Ala Asn Ala Glu Tyr Tyr Ala Lys Ile Val Ala Glu Lys
100 105 110

GCA ATG CTA CGT CGT TTA ATT GCC AAG TTG ACA GAG TCT GTC AAC CAA 384
Ala Met Leu Arg Arg Leu Ile Ala Lys Leu Thr Glu Ser Val Asn Gln
115 120 125

GCT TAC GAA GCG TCA CAA CCA GCT GAT GAA ATT ATT GCT CAG GCA GAA 432
Ala Tyr Glu Ala Ser Gln Pro Ala Asp Glu Ile Ile Ala Gln Ala Glu
130 135 140

AAA GGG TTG ATT GAT GTC AGT GAA AAT GCA AAT CGA AGC GGG TTT AAG 480
Lys Gly Leu Ile Asp Val Ser Glu Asn Ala Asn Arg Ser Gly Phe Lys
145 150 155 160

AAC ATT CGA GAT GTG TTG AAT CTC AAC TTT GGA AAT CTG GAA GCT CGC 528
Asn Ile Arg Asp Val Leu Asn Leu Asn Phe Gly Asn Leu Glu Ala Arg
165 170 175

TCG CAA CAA ACG ACC GAT ATT ACA GGT ATT GCG ACA GGT TAT CGT GAT 576
Ser Gln Gln Thr Thr Asp Ile Thr Gly Ile Ala Thr Gly Tyr Arg Asp
180 185 190

TTG GAT CAT ATG ACA ACA GGA CTT CAT GAG GAG GAG TTG ATT ATC TTA 624
Leu Asp His Met Thr Thr Gly Leu His Glu Glu Glu Leu Ile Ile Leu
195 200 205

GCA GCT CGT CCA GCA GTT GGT AAG ACA GCA TTT GCC TTG AAT ATC GCT 672
Ala Ala Arg Pro Ala Val Gly Lys Thr Ala Phe Ala Leu Asn Ile Ala
210 215 220

CAG AAT ATT GGG ACT AAG TTG GAC AAA ACG GTT GCT ATT TTT TCA CTC 720
Gln Asn Ile Gly Thr Lys Leu Asp Lys Thr Val Ala Ile Phe Ser Leu
225 230 235 240

GAA ATG GGT GCG GAA AGC TTG GTA GAC CGT ATG TTA GCT GCA GAA GGC 768
Glu Met Gly Ala Glu Ser Leu Val Asp Arg Met Leu Ala Ala Glu Gly
245 250 255

TTG GTG GAG TCA CAT TCT ATC CGT ACA GGG CAA TTG ACA GAT GAG GAG 816
Leu Val Glu Ser His Ser Ile Arg Thr Gly Gln Leu Thr Asp Glu Glu
260 265 270

TGG CAA AAA TAT ACT ATT GCT CAG GGT AAT CTA GCT AAC GCC AGT ATC 864
Trp Gln Lys Tyr Thr Ile Ala Gln Gly Asn Leu Ala Asn Ala Ser Ile
275 280 285

TAT ATC GAT GAT ACG CCA GGT ATT CGG ATT ACA GAG ATT CGT TCT CGT 912
Tyr Ile Asp Asp Thr Pro Gly Ile Arg Ile Thr Glu Ile Arg Ser Arg
290 295 300

TCT CGT AAA TTG GCT CAA GAA ACT GGA AAT CTT GGT TTG ATT TTG ATA 960
Ser Arg Lys Leu Ala Gln Glu Thr Gly Asn Leu Gly Leu Ile Leu Ile
305 310 315 320

GAC TAT TTG CAA CTT ATC ACG GGA ACT GGT CGA GAA AAT CGT CAA CAA 1008
Asp Tyr Leu Gln Leu Ile Thr Gly Thr Gly Arg Glu Asn Arg Gln Gln
325 330 335

GAA GTT TCT GAA ATT TCT CGT CAG TTG AAA ATA CTA GCC AAG GAA TTG 1056
Glu Val Ser Glu Ile Ser Arg Gln Leu Lys Ile Leu Ala Lys Glu Leu
340 345 350

AAG GTT CCA GTA ATC GCT CTG AGT CAG CTT TCT CGT GGT GTA GAA CAA 1104
Lys Val Pro Val Ile Ala Leu Ser Gln Leu Ser Arg Gly Val Glu Gln
355 360 365

CGT CAG GAC AAG AGA CCG GTC TTG TCT GAT ATT CGT GAA TCT GGG TCT 1152
Arg Gln Asp Lys Arg Pro Val Leu Ser Asp Ile Arg Glu Ser Gly Ser
370 375 380

ATT GAG CAG GAC GCT GAT ATC GTA GCT TTT CTC TAT CGC GAT GAC TAC 1200
Ile Glu Gln Asp Ala Asp Ile Val Ala Phe Leu Tyr Arg Asp Asp Tyr
385 390 395 400

TAT GAA CGT GGT GGT GAA GAA GAG GAG GGT ATC CCA AAT AAT AAG GTG 1248
Tyr Glu Arg Gly Gly Glu Glu Glu Glu Gly Ile Pro Asn Asn Lys Val
405 410 415

GAA GTT ATT ATC GAG AAA AAC CGT AGT GGA GCT CGT GGA ACA GTG GAA 1296
Glu Val Ile Ile Glu Lys Asn Arg Ser Gly Ala Arg Gly Thr Val Glu
420 425 430

TTG ATT TTC CAA AAA GAA TAC AAT AAA TTT TCA AGT ATC TCA AAG AGG 1344
Leu Ile Phe Gln Lys Glu Tyr Asn Lys Phe Ser Ser Ile Ser Lys Arg
435 440 445



(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 450 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

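Met Ala Glu Val Glu Glu Leu Arg Val Gln Pro Gln Asp Ile Leu Ala
1 5 10 15

Glu Gln Ser Val Leu Gly Ala Ile Phe Ile Asp Glu Ser Lys Leu Val
20 25 30

Phe Val Arg Glu Tyr Ile Glu Ser Arg Asp Phe Phe Lys Tyr Ala His
35 40 45

Arg Leu Ile Phe Gln Ala Met Val Asp Leu Ser Asp Arg Gly Asp Ala
50 55 60

Ile Asp Ala Thr Thr Val Arg Thr Ile Leu Asp Asn Gln Gly Asp Leu
65 70 75 80

Gln Asn Ile Gly Gly Leu Ser Tyr Leu Val Glu Ile Val Asn Ser Val
85 90 95

Pro Thr Ser Ala Asn Ala Glu Tyr Tyr Ala Lys Ile Val Ala Glu Lys
100 105 110

Ala Met Leu Arg Arg Leu Ile Ala Lys Leu Thr Glu Ser Val Asn Gln
115 120 125

Ala Tyr Glu Ala Ser Gln Pro Ala Asp Glu Ile Ile Ala Gln Ala Glu
130 135 140

Lys Gly Leu Ile Asp Val Ser Glu Asn Ala Asn Arg Ser Gly Phe Lys
145 150 155 160

Asn Ile Arg Asp Val Leu Asn Leu Asn Phe Gly Asn Leu Glu Ala Arg
165 170 175

Ser Gln Gln Thr Thr Asp Ile Thr Gly Ile Ala Thr Gly Tyr Arg Asp
180 185 190

Leu Asp His Met Thr Thr Gly Leu His Glu Glu Glu Leu Ile Ile Leu
195 200 205

Ala Ala Arg Pro Ala Val Gly Lys Thr Ala Phe Ala Leu Asn Ile Ala
210 215 220

Gln Asn Ile Gly Thr Lys Leu Asp Lys Thr Val Ala Ile Phe Ser Leu
225 230 235 240

Glu Met Gly Ala Glu Ser Leu Val Asp Arg Met Leu Ala Ala Glu Gly
245 250 255

Leu Val Glu Ser His Ser Ile Arg Thr Gly Gln Leu Thr Asp Glu Glu
260 265 270

Trp Gln Lys Tyr Thr Ile Ala Gln Gly Asn Leu Ala Asn Ala Ser Ile
275 280 285

Tyr Ile Asp Asp Thr Pro Gly Ile Arg Ile Thr Glu Ile Arg Ser Arg
290 295 300

Ser Arg Lys Leu Ala Gln Glu Thr Gly Asn Leu Gly Leu Ile Leu Ile
305 310 315 320

Asp Tyr Leu Gln Leu Ile Thr Gly Thr Gly Arg Glu Asn Arg Gln Gln
325 330 335

Glu Val Ser Glu Ile Ser Arg Gln Leu Lys Ile Leu Ala Lys Glu Leu
340 345 350

Lys Val Pro Val Ile Ala Leu Ser Gln Leu Ser Arg Gly Val Glu Gln
355 360 365

Arg Gln Asp Lys Arg Pro Val Leu Ser Asp Ile Arg Glu Ser Gly Ser
370 375 380

Ile Glu Gln Asp Ala Asp Ile Val Ala Phe Leu Tyr Arg Asp Asp Tyr
385 390 395 400

Tyr Glu Arg Gly Gly Glu Glu Glu Glu Gly Ile Pro Asn Asn Lys Val
405 410 415

Glu Val Ile Ile Glu Lys Asn Arg Ser Gly Ala Arg Gly Thr Val Glu
420 425 430

Leu Ile Phe Gln Lys Glu Tyr Asn Lys Phe Ser Ser Ile Ser Lys Arg
435 440 445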



(2) INFORMATION FOR SEQ ID NO:89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1785 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1782 

(D) OTHER INFORMATION: DnaG



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

ATG ATA ACC ATG GAG GTA TTG TGT ATG GTT GAC AAA CAA GTC ATT GAA 

Met Ile Thr Met Glu Val Leu Cys Met Val Asp Lys Gln Val Ile Glu
1 5 10 15

GAA ATC AAA AAC AAT GCC AAC ATT GTG GAA GTC ATA GGA GAT GTG ATT 

Glu Ile Lys Asn Asn Ala Asn Ile Val Glu Val Ile Gly Asp Val Ile
20 25 30

TCT TTA CAA AAG GCA GGA CGG AAC TAT CTA GGG CTC TGT CCT TTT CAT
Ser Leu Gln Lys Ala Gly Arg Asn Tyr Leu Gly Leu Cys Pro Phe His
35 40 45 

GGT GAA AAA ACA CCT TCT TTC AGC GTT GTA GAG GAC AAG CAG TTT TAC 
Gly Glu Lys Thr Pro Ser Phe Ser Val Val Glu Asp Lys Gln Phe Tyr
50 55 60 

CAC TGT TTT GGT TGT GGT CGC TCA GGT GAT GTC TTT AAA TTC ATC GAG 
His Cys Phe Gly Cys Gly Arg Ser Gly Asp Val Phe Lys Phe Ile Glu
65 70 75 80 

GAG TAC CAA GGG GTT ACC TTT ATG GAG GCT GTC CAA ATC TTA GGT CAG 
Glu Tyr Gln Gly Val Thr Phe Met Glu Ala Val Gln Ile Leu Gly Gln
85 90 95 

CGT GTC GGG ATT GAG GTT GAA AAA CCG CTT TAT AGT GAA CAG AAG CCA 
Arg Val Gly Ile Glu Val Glu Lys Pro Leu Tyr Ser Glu Gln Lys Pro
100 105 110 

GCC TCG CCT CAC CAA GCT CTT TAT GAT ATG CAC GAA GAT GCG GCT AAA 
Ala Ser Pro His Gln Ala Leu Tyr Asp Met His Glu Asp Ala Ala Lys
115 120 125 

TTT TAC CAT GCT ATT CTC ATG ACA ACG ACT ATG GGC GAA GAG GCC AGA 
Phe Tyr His Ala Ile Leu Met Thr Thr Thr Met Gly Glu Glu Ala Arg
130 135 140 

AAT TAC CTT TAT CAG CGG GGT TTG ACA GAT GAA GTG CTT AAA CAT TTT 
Asn Tyr Leu Tyr Gln Arg Gly Leu Thr Asp Glu Val Leu Lys His Phe
145 150 155 160 

TGG ATT GGT TTA GCA CCT CCA GAA CGA AAC TAT CTC TAT CAA CGT TTG 
Trp Ile Gly Leu Ala Pro Pro Glu Arg Asn Tyr Leu Tyr Gln Arg Leu
165 170 175 

TCT GAT CAG TAT CGT GAA GAG GAT TTA CTG GAT TCA GGC CTG TTT TAT 
Ser Asp Gln Tyr Arg Glu Glu Asp Leu Leu Asp Ser Gly Leu Phe Tyr
180 185 190 

CTT TCG GAT GCC AAT CAA TTT GTA GAC ACC TTT CAC AAT CGC ATT ATG 
Leu Ser Asp Ala Asn Gln Phe Val Asp Thr Phe His Asn Arg Ile Met
195 200 205 

TTT CCC CTG ACA AAT GAC CAA GGA AAG GTC ATT GCC TTC TCA GGT CGT 
Phe Pro Leu Thr Asn Asp Gln Gly Lys Val Ile Ala Phe Ser Gly Arg
210 215 220 

ATC TGG CAA AAA ACG GAT TCA CAA ACT TCT AAG TAT AAA AAC AGC CGA 
Ile Trp Gln Lys Thr Asp Ser Gln Thr Ser Lys Tyr Lys Asn Ser Arg
225 230 235 240 

TCG ACT GTA ATT TTT AAC AAA AGT TAC GAA TTA TAT CAT ATG GAT AGG 
Ser Thr Val Ile Phe Asn Lys Ser Tyr Glu Leu Tyr His Met Asp Arg
245 250 255 

GCA AAA AGA TCT TCT GGA AAA GCT AGT GAG ATT TAC CTG ATG GAA GGA 
Ala Lys Arg Ser Ser Gly Lys Ala Ser Glu Ile Tyr Leu Met Glu Gly
260 265 270 

TTC ATG GAT GTT ATT GCA GCC TAT CGG GCT GGA ATC GAA AAT GCT GTG
Phe Met Asp Val Ile Ala Ala Tyr Arg Ala Gly Ile Glu Asn Ala Val
275 280 285

GCG TCG ATG GGA ACG GCC TTG AGT CGA GAG CAT GTT GAG CAT CTG AAA 
Ala Ser Met Gly Thr Ala Leu Ser Arg Glu His Val Glu His Leu Lys 
290 295 300 

AGG TTA ACC AAG AAA TTG GTT CTT GTT TAC GAT GGA GAT AAG GCT GGG 
Arg Leu Thr Lys Lys Leu Val Leu Val Tyr Asp Gly Asp Lys Ala Gly 
305 310 315 320

CAA GCC GCG ACA TTG AAA GCA TTG GAT GAA ATT GGT GAT ATG CCT GTG 
Gln Ala Ala Thr Leu Lys Ala Leu Asp Glu Ile Gly Asp Met Pro Val
325 330 335 

CAA ATC GTC AGC ATG CCT GAT AAC TTG GAT CCT GAT GAA TAT CTA CAA 
Gln Ile Val Ser Met Pro Asp Asn Leu Asp Pro Asp Glu Tyr Leu Gln
340 345 350 

AAA AAT GGT CCA GAA GAC TTG GCC TAT CTA TTA ACG AAA ACT CGT ATT 
Lys Asn Gly Pro Glu Asp Leu Ala Tyr Leu Leu Thr Lys Thr Arg Ile
355 360 365 

AGT CCG ATT GAG TTC TAC ATT CAT CAG TAC AAA CCT GAA AAC GGT GAA 
Ser Pro Ile Glu Phe Tyr Ile His Gln Tyr Lys Pro Glu Asn Gly Glu
370 375 380 

AAT CTG CAG GCT CAG ATT GAG TTT CTT GAA AAA ATA GCT CCC TTG ATT 
Asn Leu Gln Ala Gln Ile Glu Phe Leu Glu Lys Ile Ala Pro Leu Ile
385 390 395 400 

GTT CAA GAA AAG TCC ATC GCT GCT CAA AAC AGC TAT ATT CAT ATT TTA 
Val Gln Glu Lys Ser Ile Ala Ala Gln Asn Ser Tyr Ile His Ile Leu
405 410 415 

GCT GAC AGT CTG GCG TCC TTT GAT TAT ACC CAG ATT GAG CAG ATT GTT 
Ala Asp Ser Leu Ala Ser Phe Asp Tyr Thr Gln Ile Glu Gln Ile Val
420 425 430 

AAT GAG AGT CGT CAG GTG CAA AGG CAG AAT CGC ATG GAA AGA ATT TCC 
Asn Glu Ser Arg Gln Val Gln Arg Gln Asn Arg Met Glu Arg Ile Ser
435 440 445 

AGA CCG ACG CCA ATC ACC ATG CCT GTC ACC AAG CAG TTA TCG GCT ATT 
Arg Pro Thr Pro Ile Thr Met Pro Val Thr Lys Gln Leu Ser Ala Ile
450 455 460 

ATG AGG GCA GAA GCC CAT CTA CTC TAT CGG ATG ATG GAA TCC CCT CTT 
Met Arg Ala Glu Ala His Leu Leu Tyr Arg Met Met Glu Ser Pro Leu
465 470 475 480 

GTT TTG AAC GAT TAC CGT TTG CGA GAA GAC TTT GCA TTT GCT ACA CCT 
Val Leu Asn Asp Tyr Arg Leu Arg Glu Asp Phe Ala Phe Ala Thr Pro 
485 490 495 

GAA TTT CAG GTC TTA CAT GAC TTG CTT GGC CAG TAT GGA AAT CTT CCT 
Glu Phe Gln Val Leu His Asp Leu Leu Gly Gln Tyr Gly Asn Leu Pro
500 505 510 

CCA GAA GTT TTA GCA GAG CAG ACA GAG GAA GTT GAA AGA GCT TGG TAC 
Pro Glu Val Leu Ala Glu Gln Thr Glu Glu Val Glu Arg Ala Trp Tyr
515 520 525


WO 98/26072 



PCT/US97/22578 



CAA GTT TTA GCT CAG GAT TTG CCT GCT GAG ATA TCG CCG CAG GAA CTT 

Gln Val Leu Ala Gln Asp Leu Pro Ala Glu Ile Ser Pro Gln Glu Leu

530 535 540

AGT GAA GTA GAG ATG ACT CGA AAC AAG GCT CTC TTG AAT CAG GAC AAT 

Ser Glu Val Glu Met Thr Arg Asn Lys Ala Leu Leu Asn Gln Asp Asn

545 550 555 560 

ATG AGA ATC AAA AAG AAG GTG CAG GAA GCT AGC CAT GTA GGA GAT ACA 

Met Arg Ile Lys Lys Lys Val Gln Glu Ala Ser His Val Gly Asp Thr

565 570 575 

GAT ACA GCC CTA GAA GAA TTG GAA CGT TTA ATT TCC CAA AAG AGA AGA

Asp Thr Ala Leu Glu Glu Leu Glu Arg Leu Ile Ser Gln Lys Arg Arg

580 585 590 



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 594 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 



Met Ile Thr Met Glu Val Leu Cys Met Val Asp Lys Gln Val Ile Glu
1 5 10 15

Glu Ile Lys Asn Asn Ala Asn Ile Val Glu Val Ile Gly Asp Val Ile
20 25 30

Ser Leu Gln Lys Ala Gly Arg Asn Tyr Leu Gly Leu Cys Pro Phe His
35 40 45

Gly Glu Lys Thr Pro Ser Phe Ser Val Val Glu Asp Lys Gln Phe Tyr
50 55 60

His Cys Phe Gly Cys Gly Arg Ser Gly Asp Val Phe Lys Phe Ile Glu
65 70 75 80

Glu Tyr Gln Gly Val Thr Phe Met Glu Ala Val Gln Ile Leu Gly Gln
85 90 95



Ala Ser Pro His Gln Ala Leu Tyr Asp Met His Glu Asp Ala Ala Lys
115 120 125

Phe Tyr His Ala Ile Leu Met Thr Thr Thr Met Gly Glu Glu Ala Arg
130 135 140



Phe Met Asp Val Ile Ala Ala Tyr Arg Ala Gly Ile Glu Asn Ala Val
275 280 285



Lys Asn Gly Pro Glu Asp Leu Ala Tyr Leu Leu Thr Lys Thr Arg Ile
355 360 365



Asn Glu Ser Arg Gln Val Gln Arg Gln Asn Arg Met Glu Arg Ile Ser
435 440 445



Met Arg Ala Glu Ala His Leu Leu Tyr Arg Met Met Glu Ser Pro Leu
465 470 475 480



Ser Glu Val Glu Met Thr Arg Asn Lys Ala Leu Leu Asn Gln Asp Asn
545 550 555 560



Asp Thr Ala Leu Glu Glu Leu Glu Arg Leu Ile Ser Gln Lys Arg Arg



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 900 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..897 

(D) OTHER INFORMATION: Era 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

ATG ACT TTT AAA TCA GGC TTT GTA GCC ATT TTA GGA CGT CCC AAT GTT

Met Thr Phe Lys Ser Gly Phe Val Ala Ile Leu Gly Arg Pro Asn Val

1 5 10 15

GGG AAG TCA ACC TTT TTA AAT CAC GTT ATG GGG CAA AAG ATT GCC ATC 

Gly Lys Ser Thr Phe Leu Asn His Val Met Gly Gln Lys Ile Ala Ile

20 25 30 

ATG AGT GAC AAG GCG CAG ACA ACG CGC AAT AAA ATC ATG GGA ATT TAC 

Met Ser Asp Lys Ala Gln Thr Thr Arg Asn Lys Ile Met Gly Ile Tyr

35 40 45

ACG ACT GAT AAG GAG CAA ATT GTC TTT ATC GAC ACA CCA GGG ATT CAC 192

Thr Thr Asp Lys Glu Gln Ile Val Phe Ile Asp Thr Pro Gly Ile His

50 55 60 

AAA CCT AAA ACA GCT CTC GGA GAT TTC ATG GTT GAG TCT GCC TAC AGT 240

Lys Pro Lys Thr Ala Leu Gly Asp Phe Met Val Glu Ser Ala Tyr Ser

65 70 75 80

ACC CTT CGC GAA GTG GAC ACT GTT CTT TTC ATG GTG CCT GCT GAT GAA 288

Thr Leu Arg Glu Val Asp Thr Val Leu Phe Met Val Pro Ala Asp Glu

85 90 95

GCG CGT GGT AAG GGG GAC GAT ATG ATT ATC GAG CGT CTC AAG GCT GCC 336

Ala Arg Gly Lys Gly Asp Asp Met Ile Ile Glu Arg Leu Lys Ala Ala

100 105 110

AAG GTT CCT GTG ATT TTG GTG GTG AAT AAA ATC GAT AAG GTC CAT CCA 384

Lys Val Pro Val Ile Leu Val Val Asn Lys Ile Asp Lys Val His Pro

115 120 125

GAC CAG CTC TTG TCT CAG ATT GAT GAC TTC CGT AAT CAA ATG GAC TTT 432

Asp Gln Leu Leu Ser Gln Ile Asp Asp Phe Arg Asn Gln Met Asp Phe

130 135 140

AAG GAA ATT GTT CCA ATC TCA GCC CTT CAG GGA AAT AAC GTG TCT CGT 480

Lys Glu Ile Val Pro Ile Ser Ala Leu Gln Gly Asn Asn Val Ser Arg

145 150 155 160

CTA GTG GAT ATT TTG AGT GAA AAT CTG GAT GAA GGT TTC CAA TAT TTC 528

Leu Val Asp Ile Leu Ser Glu Asn Leu Asp Glu Gly Phe Gln Tyr Phe

165 170 175

CCG TCT GAT CAA ATC ACA GAT CAT CCA GAA CGT TTC TTA GTT TCA GAA 576

Pro Ser Asp Gln Ile Thr Asp His Pro Glu Arg Phe Leu Val Ser Glu

180 185 190

ATG GTT CGC GAG AAA GTC TTG CAC CTA ACT CGT GAA GAG ATT CCG CAT 624

Met Val Arg Glu Lys Val Leu His Leu Thr Arg Glu Glu Ile Pro His

195 200 205

TCT GTA GCA GTA GTT GTT GAC TCT ATG AAA CGA GAC GAA GAG ACA GAC 672

Ser Val Ala Val Val Val Asp Ser Met Lys Arg Asp Glu Glu Thr Asp

210 215 220

AAG GTT CAC ATC CGT GCA ACC ATC ATG GTC GAG CGC GAT AGC CAA AAA 720

Lys Val His Ile Arg Ala Thr Ile Met Val Glu Arg Asp Ser Gln Lys

225 230 235 240

GGG ATT ATC ATC GGT AAA GGT GGC GCT ATG CTT AAG AAA ATC GGT AGT 768

Gly Ile Ile Ile Gly Lys Gly Gly Ala Met Leu Lys Lys Ile Gly Ser

245 250 255

ATG GCC CGT CGT GAT ATC GAA CTC ATG CTA GGA GAC AAG GTC TTC CTA 816

Met Ala Arg Arg Asp Ile Glu Leu Met Leu Gly Asp Lys Val Phe Leu

260 265 270

GAA ACC TGG GTC AAG GTC AAG AAA AAC TGG CGC GAT AAA AAG CTA GAT 864

Glu Thr Trp Val Lys Val Lys Lys Asn Trp Arg Asp Lys Lys Leu Asp

275 280 285

TTG GCT GAC TTT GGC TAT AAT GAA AGA GAA TAC TAA 900



(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 299 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:92: 



Met Thr Phe Lys Ser Gly Phe Val Ala Ile Leu Gly Arg Pro Asn Val
1 5 10 15

Gly Lys Ser Thr Phe Leu Asn His Val Met Gly Gln Lys Ile Ala Ile
20 25 30

Met Ser Asp Lys Ala Gln Thr Thr Arg Asn Lys Ile Met Gly Ile Tyr
35 40 45

Thr Thr Asp Lys Glu Gln Ile Val Phe Ile Asp Thr Pro Gly Ile His
50 55 60

Lys Pro Lys Thr Ala Leu Gly Asp Phe Met Val Glu Ser Ala Tyr Ser
65 70 75 80

Thr Leu Arg Glu Val Asp Thr Val Leu Phe Met Val Pro Ala Asp Glu
85 90 95

Ala Arg Gly Lys Gly Asp Asp Met Ile Ile Glu Arg Leu Lys Ala Ala
100 105 110



Asp Gln Leu Leu Ser Gln Ile Asp Asp Phe Arg Asn Gln Met Asp Phe
130 135 140

Lys Glu Ile Val Pro Ile Ser Ala Leu Gln Gly Asn Asn Val Ser Arg
145 150 155 160

Leu Val Asp Ile Leu Ser Glu Asn Leu Asp Glu Gly Phe Gln Tyr Phe
165 170 175



Met Val Arg Glu Lys Val Leu His Leu Thr Arg Glu Glu Ile Pro His
195 200 205

Ser Val Ala Val Val Val Asp Ser Met Lys Arg Asp Glu Glu Thr Asp
210 215 220

Lys Val His Ile Arg Ala Thr Ile Met Val Glu Arg Asp Ser Gln Lys
225 230 235 240

Gly Ile Ile Ile Gly Lys Gly Gly Ala Met Leu Lys Lys Ile Gly Ser
245 250 255



(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1011 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1008

(D) OTHER INFORMATION: Gcp 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 



ATG AAG GAT AGA TAT ATT TTA GCA TTT GAG ACA TCC TGT GAT GAG ACC 

Met Lys Asp Arg Tyr Ile Leu Ala Phe Glu Thr Ser Cys Asp Glu Thr

1 5 10 15

AGT GTC GCC GTC TTG AAA AAC GAC GAT GAG CTC TTG TCC AAT GTC ATT 

Ser Val Ala Val Leu Lys Asn Asp Asp Glu Leu Leu Ser Asn Val Ile
20 25 30 

GCT AGT CAA ATT GAG AGT CAC AAA CGT TTT GGT GGC GTA GTG CCC GAA 

Ala Ser Gln Ile Glu Ser His Lys Arg Phe Gly Gly Val Val Pro Glu
35 40 45 

GTA GCC AGT CGT CAC CAT GTC GAG GTC ATT ACA GCC TGT ATC GAG GAG 

Val Ala Ser Arg His His Val Glu Val Ile Thr Ala Cys Ile Glu Glu
50 55 60 

GCA TTG GCA GAA GCA GGG ATT ACC GAA GAG GAC GTG ACA GCT GTT GCG 

Ala Leu Ala Glu Ala Gly Ile Thr Glu Glu Asp Val Thr Ala Val Ala

65 70 75 80 

GTT ACC TAC GGA CCA GGC TTG GTC GGA GCC TTG CTA GTT GGT TTG TCA 
Val Thr Tyr Gly Pro Gly Leu Val Gly Ala Leu Leu Val Gly Leu Ser 

85 90 95 

GCT GCC AAG GCC TTT GCT TGG GCT CAC GGA CTT CCA CTG ATT CCT GTT 
Ala Ala Lys Ala Phe Ala Trp Ala His Gly Leu Pro Leu Ile Pro Val
100 105 110





AAT CAC ATG GCT GGG CAC CTC ATG GCA GCT CAG AGT GTG GAG CCT TTG 
Asn His Met Ala Gly His Leu Met Ala Ala Gln Ser Val Glu Pro Leu
115 120 125 

GAG TTT CCC TTG CTA GCC CTT TTA GTC AGT GGT GGG CAC ACA GAG TTG 
Glu Phe Pro Leu Leu Ala Leu Leu Val Ser Gly Gly His Thr Glu Leu 
130 135 140 

GTC TAT GTT TCT GAG GCT GGC GAT TAC AAG ATT GTT GGG GAG ACA CGA 
Val Tyr Val Ser Glu Ala Gly Asp Tyr Lys Ile Val Gly Glu Thr Arg
145 150 155 160 

GAC GAT GCA GTT GGG GAG GCT TAT GAC AAG GTC GGT CGT GTC ATG GGC 
Asp Asp Ala Val Gly Glu Ala Tyr Asp Lys Val Gly Arg Val Met Gly 
165 170 175 

TTG ACC TAT CCT GCA GGT CGT GAG ATT GAC GAG CTG GCT CAT CAG GGG 
Leu Thr Tyr Pro Ala Gly Arg Glu Ile Asp Glu Leu Ala His Gln Gly
180 185 190 

CAC GAT ATT TAT GAT TTC CCC CGT GCC ATG ATT AAG GAA GAT AAT CTG 
His Asp Ile Tyr Asp Phe Pro Arg Ala Met Ile Lys Glu Asp Asn Leu
195 200 205 

GAG TTC TCC TTC TCA GGT TTG AAA TCT GCC TTT ATC AAT CTT CAT CAC 
Glu Phe Ser Phe Ser Gly Leu Lys Ser Ala Phe Ile Asn Leu His His
210 215 220 

AAT GCC GAG CAA AAG GGA GAA AGC CTG TCT ACA GAA GAT TTG TGT GCT 
Asn Ala Glu Gln Lys Gly Glu Ser Leu Ser Thr Glu Asp Leu Cys Ala
225 230 235 240 

TCC TTC CAA GCA GCA GTT ATG GAC ATT CTC ATG GCA AAA ACC AAG AAG 
Ser Phe Gln Ala Ala Val Met Asp Ile Leu Met Ala Lys Thr Lys Lys
245 250 255 

GCT TTG GAG AAA TAT CCT GTT AAA ACC CTA GTT GTG GCA GGT GGT GTG 
Ala Leu Glu Lys Tyr Pro Val Lys Thr Leu Val Val Ala Gly Gly Val 
260 265 270 

GCA GCC AAT AAA GGT CTC AGA GAA CGC CTA GCA ACT GAA ATC ACA GAT 
Ala Ala Asn Lys Gly Leu Arg Glu Arg Leu Ala Thr Glu Ile Thr Asp
275 280 285 

GTC AAT GTT ATC ATT CCA CCT CTG CGT CTC TGC GGA GAC AAT GCA GGT 
Val Asn Val Ile Ile Pro Pro Leu Arg Leu Cys Gly Asp Asn Ala Gly
290 295 300 

ATG ATT GCT TAT GCC AGT GTC AGC GAG TGG AAC AAA GAA AAC TTT GCA 
Met Ile Ala Tyr Ala Ser Val Ser Glu Trp Asn Lys Glu Asn Phe Ala
305 310 315 320 

AAC TTG GAC CTC AAT GCC AAA CCA AGT CTT GCC TTT GAT ACC ATG GAA 
Asn Leu Asp Leu Asn Ala Lys Pro Ser Leu Ala Phe Asp Thr Met Glu 

325 330 335



(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 336 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Met Lys Asp Arg Tyr Ile Leu Ala Phe Glu Thr Ser Cys Asp Glu Thr
1 5 10 15

Ser Val Ala Val Leu Lys Asn Asp Asp Glu Leu Leu Ser Asn Val Ile
20 25 30

Ala Ser Gln Ile Glu Ser His Lys Arg Phe Gly Gly Val Val Pro Glu
35 40 45

Val Ala Ser Arg His His Val Glu Val Ile Thr Ala Cys Ile Glu Glu
50 55 60



Val Thr Tyr Gly Pro Gly Leu Val Gly Ala Leu Leu Val Gly Leu Ser
85 90 95

Ala Ala Lys Ala Phe Ala Trp Ala His Gly Leu Pro Leu Ile Pro Val
100 105 110

Asn His Met Ala Gly His Leu Met Ala Ala Gln Ser Val Glu Pro Leu
115 120 125

Glu Phe Pro Leu Leu Ala Leu Leu Val Ser Gly Gly His Thr Glu Leu
130 135 140

Val Tyr Val Ser Glu Ala Gly Asp Tyr Lys Ile Val Gly Glu Thr Arg
145 150 155 160

Asp Asp Ala Val Gly Glu Ala Tyr Asp Lys Val Gly Arg Val Met Gly
165 170 175

Leu Thr Tyr Pro Ala Gly Arg Glu Ile Asp Glu Leu Ala His Gln Gly
180 185 190

His Asp Ile Tyr Asp Phe Pro Arg Ala Met Ile Lys Glu Asp Asn Leu
195 200 205

Glu Phe Ser Phe Ser Gly Leu Lys Ser Ala Phe Ile Asn Leu His His
210 215 220

Asn Ala Glu Gln Lys Gly Glu Ser Leu Ser Thr Glu Asp Leu Cys Ala
225 230 235 240



Ala Leu Glu Lys Tyr Pro Val Lys Thr Leu Val Val Ala Gly Gly Val
260 265 270



Val Asn Val Ile Ile Pro Pro Leu Arg Leu Cys Gly Asp Asn Ala Gly
290 295 300



(2) INFORMATION FOR SEQ ID NO: 95: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 774 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..771

(D) OTHER INFORMATION: HI0454



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:



ATG ATT TTT GAT ACA CAT ACA CAC TTG AAT GTA GAA GAA TTT GCA GGT 

Met Ile Phe Asp Thr His Thr His Leu Asn Val Glu Glu Phe Ala Gly

1 5 10 15

CGT GAG GCA GAA GAA ATT GCC TTG GCT GCT GAG ATG GGT GTG ACA CAG 

Arg Glu Ala Glu Glu Ile Ala Leu Ala Ala Glu Met Gly Val Thr Gln

20 25 30 

ATG AAT ATT GTT GGT TTT GAT AAA CCG ACG ATT GAG CAT GCC TTG GAG 

Met Asn Ile Val Gly Phe Asp Lys Pro Thr Ile Glu His Ala Leu Glu
35 40 45 

TTG GTA GAT GAG TAT GAG CAG CTC TAT GCG ACT ATT GGT TGG CAT CCT 

Leu Val Asp Glu Tyr Glu Gln Leu Tyr Ala Thr Ile Gly Trp His Pro
50 55 60


ACA GAA GCT GGT ACT TAT ACA GAG GAA GTT GAG GCT TAC TTG TTG GAT 
Thr Glu Ala Gly Thr Tyr Thr Glu Glu Val Glu Ala Tyr Leu Leu Asp
65 70 75 80


AAG TTA AAA CAT TCC AAG GTT GTG GCT TTA GGT GAA ATT GGC TTA GAC 

Lys Leu Lys His Ser Lys Val Val Ala Leu Gly Glu Ile Gly Leu Asp
85 90 95 

TAC CAT TGG ATG ACA GCG CCC AAA GAG GTG CAG GAG CAG GTT TTT CGC

Tyr His Trp Met Thr Ala Pro Lys Glu Val Gln Glu Gln Val Phe Arg

100 105 110 

CGT CAG ATT CAG CTA TCT AAG GAC TTG GAT TTG CCT TTT GTT GTC CAT 

Arg Gln Ile Gln Leu Ser Lys Asp Leu Asp Leu Pro Phe Val Val His

115 120 125 

ACC CGT GAT GCG CTG GAA GAT ACC TAT GAG ATT ATC AAG AGT GAG GGC 

Thr Arg Asp Ala Leu Glu Asp Thr Tyr Glu Ile Ile Lys Ser Glu Gly
130 135 140

GTT GGT CCT CGT GGT GGT ATC ATG CAT TCA TTT TCA GGG ACG CTT GAG 

Val Gly Pro Arg Gly Gly Ile Met His Ser Phe Ser Gly Thr Leu Glu

145 150 155 160 

TGG GCA GAG AAG TTT GTG GAT CTT GGT ATG ACC ATT TCC TTC TCA GGA 

Trp Ala Glu Lys Phe Val Asp Leu Gly Met Thr Ile Ser Phe Ser Gly

165 170 175 

GTG GTG ACC TTC AAG AAG GCA ACT GAC CTC CAA GAA GCA GCT AAA GAG 

Val Val Thr Phe Lys Lys Ala Thr Asp Leu Gln Glu Ala Ala Lys Glu

180 185 190 

TTA CCT TTG GAC AAG ATG TTG GTA GAA ACA GAT GCG CCT TAC TTA GCA 

Leu Pro Leu Asp Lys Met Leu Val Glu Thr Asp Ala Pro Tyr Leu Ala 

195 200 205

CCT GTA CCC AAG CGT GGT CGT GAA AAT AAA ACA GCC TAT ACT CGC TAT 

Pro Val Pro Lys Arg Gly Arg Glu Asn Lys Thr Ala Tyr Thr Arg Tyr 
210 215 220 

GTG GTC GAC TTT ATC GCT GAC TTG CGT GGT ATG ACG ACA GAA GAG CTG 

Val Val Asp Phe Ile Ala Asp Leu Arg Gly Met Thr Thr Glu Glu Leu

225 230 235 240 

GCG GTA GCA ACG ACT GCA AAT GCA GAA CGC ATT TTT GGA TTG GAC AGC 

Ala Val Ala Thr Thr Ala Asn Ala Glu Arg Ile Phe Gly Leu Asp Ser

245 250 255 

AAG TAA 
Lys 



(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 257 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:96: 

Met Ile Phe Asp Thr His Thr His Leu Asn Val Glu Glu Phe Ala Gly
1 5 10 15

Arg Glu Ala Glu Glu Ile Ala Leu Ala Ala Glu Met Gly Val Thr Gln
20 25 30

Met Asn Ile Val Gly Phe Asp Lys Pro Thr Ile Glu His Ala Leu Glu
35 40 45

Leu Val Asp Glu Tyr Glu Gln Leu Tyr Ala Thr Ile Gly Trp His Pro
50 55 60

Thr Glu Ala Gly Thr Tyr Thr Glu Glu Val Glu Ala Tyr Leu Leu Asp
65 70 75 80

Lys Leu Lys His Ser Lys Val Val Ala Leu Gly Glu Ile Gly Leu Asp
85 90 95

Tyr His Trp Met Thr Ala Pro Lys Glu Val Gln Glu Gln Val Phe Arg
100 105 110

Arg Gln Ile Gln Leu Ser Lys Asp Leu Asp Leu Pro Phe Val Val His
115 120 125

Thr Arg Asp Ala Leu Glu Asp Thr Tyr Glu Ile Ile Lys Ser Glu Gly
130 135 140

Val Gly Pro Arg Gly Gly Ile Met His Ser Phe Ser Gly Thr Leu Glu
145 150 155 160

Trp Ala Glu Lys Phe Val Asp Leu Gly Met Thr Ile Ser Phe Ser Gly
165 170 175

Val Val Thr Phe Lys Lys Ala Thr Asp Leu Gln Glu Ala Ala Lys Glu
180 185 190

Leu Pro Leu Asp Lys Met Leu Val Glu Thr Asp Ala Pro Tyr Leu Ala
195 200 205

Pro Val Pro Lys Arg Gly Arg Glu Asn Lys Thr Ala Tyr Thr Arg Tyr
210 215 220

Val Val Asp Phe Ile Ala Asp Leu Arg Gly Met Thr Thr Glu Glu Leu
225 230 235 240

Ala Val Ala Thr Thr Ala Asn Ala Glu Arg Ile Phe Gly Leu Asp Ser



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1959

(D) OTHER INFORMATION: Ligase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

ATG AAT AAA AGA ATG AAT GAG TTA GTC GCT TTG CTC AAT CGC TAT GCG 
Met Asn Lys Arg Met Asn Glu Leu Val Ala Leu Leu Asn Arg Tyr Ala
1 5 10 15

ACT GAG TAC TAT ACC AGC GAT AAT CCC TCG GTT TCA GAC AGT GAG TAT 
Thr Glu Tyr Tyr Thr Ser Asp Asn Pro Ser Val Ser Asp Ser Glu Tyr 
20 25 30 

GAC CGC CTT TAC CGT GAG TTG GTC GAG TTA GAA ACT GCT TAT CCA GAG 
Asp Arg Leu Tyr Arg Glu Leu Val Glu Leu Glu Thr Ala Tyr Pro Glu 
35 40 45

CAA GTG CTA GCA GAC AGT CCG ACT CAT CGT GTT GGT GGC AAG GTT TTA 
Gln Val Leu Ala Asp Ser Pro Thr His Arg Val Gly Gly Lys Val Leu
50 55 60 

GAT GGT TTT GAA AAA TAC AGT CAT CAG TAT CCT CTT TAT AGT TTG CAG 
Asp Gly Phe Glu Lys Tyr Ser His Gln Tyr Pro Leu Tyr Ser Leu Gln
65 70 75 80

GAT GCT TTT TCA CGT GAG GAG CTA GAT GCT TTT GAT GCG CGT GTT CGT 
Asp Ala Phe Ser Arg Glu Glu Leu Asp Ala Phe Asp Ala Arg Val Arg 
85 90 95 

AAG GAA GTG GCT CAT CCG ACC TAT ATT TGT GAG CTG AAA ATC GAT GGC 
Lys Glu Val Ala His Pro Thr Tyr Ile Cys Glu Leu Lys Ile Asp Gly
100 105 110 

TTA TCT ATC TCG CTG ACT TAT GA^ AAG GGG ATT TTG GTT GCT GGG GTA 
Leu Ser Ile Ser Leu Thr Tyr Glu Lys Gly Ile Leu Val Ala Gly Val
115 120 125 

ACA CGT GGA GAT GGT TCA ATT GGT GAA AAT ATC ACA GAA AAC CTC AAG 
Thr Arg Gly Asp Gly Ser Ile Gly Glu Asn Ile Thr Glu Asn Leu Lys
130 135 140

CGT GTT AAG GAC ATC CCT TTG ACT TTG CCA GAA GAA CTA GAT ATC ACA 
Arg Val Lys Asp Ile Pro Leu Thr Leu Pro Glu Glu Leu Asp Ile Thr
145 150 155 160 

GTT CGT GGG GAA TGT TAC ATG CCA CGC GCT TCC TTT GAC CAA GTT AAC 
Val Arg Gly Glu Cys Tyr Met Pro Arg Ala Ser Phe Asp Gln Val Asn
165 170 175 

CAA GCG CGC CAA GAA AAT GGA GAG CCT GAA TTT GCT AAT CCT CGT AAT 
Gln Ala Arg Gln Glu Asn Gly Glu Pro Glu Phe Ala Asn Pro Arg Asn
180 185 190 

GCG GCA GCA GGA ACT CTG CGT CAG TTG GAT ACA GCA GTA GTT GCC AAG 
Ala Ala Ala Gly Thr Leu Arg Gln Leu Asp Thr Ala Val Val Ala Lys
195 200 205 

CGT AAT CTT GCA ACG TTT CTC TAT CAA GAA GCC AGC CCT TCA ACT CGT 
Arg Asn Leu Ala Thr Phe Leu Tyr Gln Glu Ala Ser Pro Ser Thr Arg
210 215 220

GAT AGC CAA GAA AAG GGT TTG AAG TAG CTA GAA CAA CTA GGT TTT GTG

Asp Ser Gln Glu Lys Gly Leu Lys Tyr Leu Glu Gln Leu Gly Phe Val
225 230 235 240 

GTC AAT CCT AAG CGA ATC TTG GCT GA^ AAC ATA GAT GAA ATC TGG AAT 

Val Asn Pro Lys Arg Ile Leu Ala Glu Asn Ile Asp Glu Ile Trp Asn

245 250 255 

TTT ATC CAA GAA GTA GGA CAG GAA CGG GAA AAT CTG CCT TAC GAT ATT

Phe Ile Gln Glu Val Gly Gln Glu Arg Glu Asn Leu Pro Tyr Asp Ile
260 265 270 

GAT GGA GTG GTA ATC AAG GTC AAC GAC CTA GCA AGT CAA GAA GAA CTT 

Asp Gly Val Val Ile Lys Val Asn Asp Leu Ala Ser Gln Glu Glu Leu
275 280 285

GGT TTT ACC GTT AAG GCT CCA AAG TGG GCA GTA GCC TAC AAG TTC CCT 

Gly Phe Thr Val Lys Ala Pro Lys Trp Ala Val Ala Tyr Lys Phe Pro 

290 295 300 

GCT GAA GAA AAA GAA GCT CAA CTC TTA TCA GTT GAC TGG ACA GTT GGC 

Ala Glu Glu Lys Glu Ala Gln Leu Leu Ser Val Asp Trp Thr Val Gly
305 310 315 320 

CGT ACC GGT GTT GTA ACT CCA ACT GCT AAT CTA ACA CCA GTA CAA CTT 

Arg Thr Gly Val Val Thr Pro Thr Ala Asn Leu Thr Pro Val Gln Leu

325 330 335 

GCC GGT ACG ACT GTT AGC CGT GCG ACC CTG CAC AAT GTA GAT TAT ATT 

Ala Gly Thr Thr Val Ser Arg Ala Thr Leu His Asn Val Asp Tyr Ile
340 345 350 

GCT GAA AAA GAT ATC CGA AAA GAC GAT ACG GTC ATT GTA TAT AAG GCT 

Ala Glu Lys Asp Ile Arg Lys Asp Asp Thr Val Ile Val Tyr Lys Ala
355 360 365 

GGT GAC ATC ATC CCT GCC GTT TTA CGT GTG GTA GAG TCC AAA CGG GTT 

Gly Asp Ile Ile Pro Ala Val Leu Arg Val Val Glu Ser Lys Arg Val
370 375 380 

TCT GAA GAA AAA CTA GAT ATC CCT ACA AAC TGT CCA AGT TGT AAC TCT 

Ser Glu Glu Lys Leu Asp Ile Pro Thr Asn Cys Pro Ser Cys Asn Ser
385 390 395 400

GAC TTG TTG CAC TTT GAA GAT GAA GTG GCC CTA CGT TGT ATC AAT CCG 

Asp Leu Leu His Phe Glu Asp Glu Val Ala Leu Arg Cys Ile Asn Pro

405 410 415

CGT TGC CCT GCT CAA ATC ATG GAA GGC TTG ATT CAC TTT GCT TCT CGT 

Arg Cys Pro Ala Gln Ile Met Glu Gly Leu Ile His Phe Ala Ser Arg
420 425 430 

GAT GCT ATG AAT ATT ACA GGC CTT GGT CCA TCT ATT GTT GAG AAG CTT

Asp Ala Met Asn Ile Thr Gly Leu Gly Pro Ser Ile Val Glu Lys Leu
435 440 445 

TTT GCT GCT AAT TTA GTC AAG GAT GTG GCG GAT ATT TAT CGT TTG CAA

Phe Ala Ala Asn Leu Val Lys Asp Val Ala Asp Ile Tyr Arg Leu Gln
450 455 460

GAA GAG GAT TTC CTC CTT TTA GAG GGG GTT AAG GAA AAG TCC GCT GCT

Glu Glu Asp Phe Leu Leu Leu Glu Gly Val Lys Glu Lys Ser Ala Ala

465 470 475 480 

AAA CTG TAT CAG GCT ATC CAA GCA TCA AAG GAA AAT TCT GCC GAG AAG 

Lys Leu Tyr Gln Ala Ile Gln Ala Ser Lys Glu Asn Ser Ala Glu Lys

485 490 495 

CTC TTA TTT GGT TTG GGA ATT CGT CAT GTC GGA AGC AAG GCT AGT CAG 

Leu Leu Phe Gly Leu Gly Ile Arg His Val Gly Ser Lys Ala Ser Gln

500 505 510 

CTT TTA CTT CAA TAT TTC CAT TCA ATT GAA AAT CTG TAT CAG GCA GAT 

Leu Leu Leu Gln Tyr Phe His Ser Ile Glu Asn Leu Tyr Gln Ala Asp

515 520 525 

TCA GAG GAA GTG GCT AGT ATT GAA AGT CTA GGT GGC GTG ATT GCC AAA 

Ser Glu Glu Val Ala Ser Ile Glu Ser Leu Gly Gly Val Ile Ala Lys

530 535 540 

AGT CTT CAG ACT TAT TTT GCG GCA GAA GGC TCT GAA ATT CTG CTC AGA 

Ser Leu Gln Thr Tyr Phe Ala Ala Glu Gly Ser Glu Ile Leu Leu Arg

545 550 555 560 

GAA TTG AAA GAA ACT GGG GTC AAT CTG GAC TAT AAA GGA CAG ACG GTA 

Glu Leu Lys Glu Thr Gly Val Asn Leu Asp Tyr Lys Gly Gln Thr Val

565 570 575 

GTA GCG GAT GCG GCC TTG TCA GGT TTG ACC GTG GTA TTG ACA GGA AAA 

Val Ala Asp Ala Ala Leu Ser Gly Leu Thr Val Val Leu Thr Gly Lys 

580 585 590 

TTG GAA CGA CTC AAG CGC TCA GAA GCT AAA AGT AAA CTC GAA AGT CTG 

Leu Glu Arg Leu Lys Arg Ser Glu Ala Lys Ser Lys Leu Glu Ser Leu 

595 600 605

GGT GCC AAA GTG ACA GGT AGT GTT TCT AAA AAG ACC GAC CTC GTC GTG 

Gly Ala Lys Val Thr Gly Ser Val Ser Lys Lys Thr Asp Leu Val Val 

610 615 620 

GTA GGT GCA GAC GCT GGA AGT AAA CTG CAA AAA GCA CAA GAA CTT GGT 

Val Gly Ala Asp Ala Gly Ser Lys Leu Gln Lys Ala Gln Glu Leu Gly

625 630 635 640 

ATC CAG GTC AGA GAT GAG GCA TGG CTA GAA AGT TTG TAA 

Ile Gln Val Arg Asp Glu Ala Trp Leu Glu Ser Leu

645 650 



(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 653 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

Met Asn Lys Arg Met Asn Glu Leu Val Ala Leu Leu Asn Arg Tyr Ala
1 5 10 15

Thr Glu Tyr Tyr Thr Ser Asp Asn Pro Ser Val Ser Asp Ser Glu Tyr
20 25 30

Asp Arg Leu Tyr Arg Glu Leu Val Glu Leu Glu Thr Ala Tyr Pro Glu
35 40 45

Gln Val Leu Ala Asp Ser Pro Thr His Arg Val Gly Gly Lys Val Leu
50 55 60

Asp Gly Phe Glu Lys Tyr Ser His Gln Tyr Pro Leu Tyr Ser Leu Gln
65 70 75 80

Asp Ala Phe Ser Arg Glu Glu Leu Asp Ala Phe Asp Ala Arg Val Arg
85 90 95

Lys Glu Val Ala His Pro Thr Tyr Ile Cys Glu Leu Lys Ile Asp Gly
100 105 110




Thr Arg Gly Asp Gly Ser Ile Gly Glu Asn Ile Thr Glu Asn Leu Lys
130 135 140



Arg Asn Leu Ala Thr Phe Leu Tyr Gln Glu Ala Ser Pro Ser Thr Arg
210 215 220

Asp Ser Gln Glu Lys Gly Leu Lys Tyr Leu Glu Gln Leu Gly Phe Val
225 230 235 240



Gly Phe Thr Val Lys Ala Pro Lys Trp Ala Val Ala Tyr Lys Phe Pro
290 295 300



Arg Thr Gly Val Val Thr Pro Thr Ala Asn Leu Thr Pro Val Gln Leu
325 330 335

Ala Gly Thr Thr Val Ser Arg Ala Thr Leu His Asn Val Asp Tyr Ile
340 345 350

Ala Glu Lys Asp Ile Arg Lys Asp Asp Thr Val Ile Val Tyr Lys Ala
355 360 365

Gly Asp Ile Ile Pro Ala Val Leu Arg Val Val Glu Ser Lys Arg Val
370 375 380



Asp Leu Leu His Phe Glu Asp Glu Val Ala Leu Arg Cys Ile Asn Pro
405 410 415

Arg Cys Pro Ala Gln Ile Met Glu Gly Leu Ile His Phe Ala Ser Arg
420 425 430

Asp Ala Met Asn Ile Thr Gly Leu Gly Pro Ser Ile Val Glu Lys Leu
435 440 445



Lys Leu Tyr Gln Ala Ile Gln Ala Ser Lys Glu Asn Ser Ala Glu Lys
485 490 495



Glu Leu Lys Glu Thr Gly Val Asn Leu Asp Tyr Lys Gly Gln Thr Val
565 570 575

Val Ala Asp Ala Ala Leu Ser Gly Leu Thr Val Val Leu Thr Gly Lys
580 585 590

Leu Glu Arg Leu Lys Arg Ser Glu Ala Lys Ser Lys Leu Glu Ser Leu
595 600 605

Gly Ala Lys Val Thr Gly Ser Val Ser Lys Lys Thr Asp Leu Val Val
610 615 620

Val Gly Ala Asp Ala Gly Ser Lys Leu Gln Lys Ala Gln Glu Leu Gly
625 630 635 640

Ile Gln Val Arg Asp Glu Ala Trp Leu Glu Ser Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 981 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..981 

(D) OTHER INFORMATION: MraY 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

ATG TTT ATT TCC ATC AGT GCT GGA ATT GTG ACA TTT TTA CTA ACT TTA 
Met Phe Ile Ser Ile Ser Ala Gly Ile Val Thr Phe Leu Leu Thr Leu
1 5 10 15

GTA GGA ATT CCG GCC TTT ATC CAA TTT TAT AGA AAG GCG CAA ATT ACA 
Val Gly Ile Pro Ala Phe Ile Gln Phe Tyr Arg Lys Ala Gln Ile Thr
20 25 30 

GGC CAG CAG ATG CAT GAG GAT GTC AAA CAG CAT CAG GCA AAA GCT GGG 
Gly Gln Gln Met His Glu Asp Val Lys Gln His Gln Ala Lys Ala Gly
35 40 45

ACT CCT ACA ATG GGA GGT TTG GTT TTC TTG ATT ACT TCT GTT TTG GTT 
Thr Pro Thr Met Gly Gly Leu Val Phe Leu Ile Thr Ser Val Leu Val
50 55 60 

GCT TTC TTT TTC GCC CTA TTT AGT AGC CAA TTC AGC AAT AAT GTG GGA 
Ala Phe Phe Phe Ala Leu Phe Ser Ser Gln Phe Ser Asn Asn Val Gly
65 70 75 80


ATG ATT TTG TTC ATC TTG GTC TTG TAT GGC TTG GTC GGA TTT TTA GAT 
Met Ile Leu Phe Ile Leu Val Leu Tyr Gly Leu Val Gly Phe Leu Asp
85 90 95


GAC TTT CTC AAG GTC TTT CGT AAA ATC AAT GAG GGG CTT AAT CCT AAG 

Asp Phe Leu Lys Val Phe Arg Lys Ile Asn Glu Gly Leu Asn Pro Lys
100 105 110 

CAA AAA TTA GCT CTT CAG CTT CTA GGT GGA GTT ATC TTC TAT CTT TTC 

Gln Lys Leu Ala Leu Gln Leu Leu Gly Gly Val Ile Phe Tyr Leu Phe
115 120 125

TAT GAG CGC GGT GGC GAT ATC CTG TCT GTC TTT GGT TAT CCA GTT CAT 

Tyr Glu Arg Gly Gly Asp Ile Leu Ser Val Phe Gly Tyr Pro Val His

130 135 140 

TTG GGA TTT TTC TAT ATT TTC TTC GCT CTT TTC TGG CTA GTC GGT TTT
Leu Gly Phe Phe Tyr Ile Phe Phe Ala Leu Phe Trp Leu Val Gly Phe
145 150 155 160

TTG ACA GAC GGT GTT GAC GGT TTA GCT AGT ATT
Leu Thr Asp Gly Val Asp Gly Leu Ala Ser Ile
165 170 175

TCC GTT GTG ATT AGT TTG TTT GCC TAT GGA GTT ATT GCC TAT GTG CA^
Ser Val Val Ile Ser Leu Phe Ala Tyr Gly Val Ile Ala Tyr Val Gln
180 185 190

GGT CAG ATG GAT ATT CTT CTA GTG ATT CTT GCC ATG ATT GGT GGT TTG
Gly Gln Met Asp Ile Leu Leu Val Ile Leu Ala Met Ile Gly Gly Leu
195 200 205

CTC GGT TTC TTC ATC TTT AAC CAT AAG CCT GCC AAG GTC TTT ATG GGT
Leu Gly Phe Phe Ile Phe Asn His Lys Pro Ala Lys Val Phe Met Gly
210 215 220

GAT GTG GGA AGT TTG GCC CTA GGT GGG ATG CTG GCA GCT ATC TCT ATG
Asp Val Gly Ser Leu Ala Leu Gly Gly Met Leu Ala Ala Ile Ser Met
225 230 235 240

GCT CTC CAC CAG GAA TGG ACT CTC TTG ATT ATC GGA ATT GTG TAT GTT
Ala Leu His Gln Glu Trp Thr Leu Leu Ile Ile Gly Ile Val Tyr Val
245 250 255

TTT GAA ACA ACT TCT GTT ATG ATG CAA GTC AGT TAT TTC AAA CTG ACA
Phe Glu Thr Thr Ser Val Met Met Gln Val Ser Tyr Phe Lys Leu Thr
260 265 270

GGT GGT AAA CGT ATT TTC CGT ATG ACG CCT GTA CAT CAC CAT TTT GAG
Gly Gly Lys Arg Ile Phe Arg Met Thr Pro Val His His His Phe Glu
275 280 285

CTT GGG GGA TTG TCT GGT AAA GGA AAT CCT TGG AGC GAG TGG AAG GTT
Leu Gly Gly Leu Ser Gly Lys Gly Asn Pro Trp Ser Glu Trp Lys Val
290 295 300

GAC TTC TTC TTT TGG GGA GTT GGG CTT CTA GCA AGT CTC CTG ACC CTC
Asp Phe Phe Phe Trp Gly Val Gly Leu Leu Ala Ser Leu Leu Thr Leu
305 310 315 320

GCA ATT TTG TAT TTG ATG TAA
Ala Ile Leu Tyr Leu Met
325



(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 326 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
Met Phe Ile Ser Ile Ser Ala Gly Ile Val Thr Phe Leu Leu Thr Leu
1 5 10 15

Val Gly Ile Pro Ala Phe Ile Gln Phe Tyr Arg Lys Ala Gln Ile Thr
20 25 30

Gly Gln Gln Met His Glu Asp Val Lys Gln His Gln Ala Lys Ala Gly
35 40 45

Thr Pro Thr Met Gly Gly Leu Val Phe Leu Ile Thr Ser Val Leu Val
50 55 60

Ala Phe Phe Phe Ala Leu Phe Ser Ser Gln Phe Ser Asn Asn Val Gly
65 70 75 80

Met Ile Leu Phe Ile Leu Val Leu Tyr Gly Leu Val Gly Phe Leu Asp
85 90 95


Gln Lys Leu Ala Leu Gln Leu Leu Gly Gly Val Ile Phe Tyr Leu Phe
115 120 125

Tyr Glu Arg Gly Gly Asp Ile Leu Ser Val Phe Gly Tyr Pro Val His
130 135 140

Leu Gly Phe Phe Tyr Ile Phe Phe Ala Leu Phe Trp Leu Val Gly Phe
145 150 155 160

Ser Asn Ala Val Asn Leu Thr Asp Gly Val Asp Gly Leu Ala Ser Ile
165 170 175


Gly Gln Met Asp Ile Leu Leu Val Ile Leu Ala Met Ile Gly Gly Leu
195 200 205


Asp Val Gly Ser Leu Ala Leu Gly Gly Met Leu Ala Ala Ile Ser Met
225 230 235 240

Ala Leu His Gln Glu Trp Thr Leu Leu Ile Ile Gly Ile Val Tyr Val
245 250 255

Phe Glu Thr Thr Ser Val Met Met Gln Val Ser Tyr Phe Lys Leu Thr
260 265 270

Gly Gly Lys Arg Ile Phe Arg Met Thr Pro Val His His His Phe Glu
275 280 285



Ala Ile Leu Tyr Leu Met
325

(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 369 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..366
(D) OTHER INFORMATION: Dpj

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

ATG AGA ATG ATA GTT GGA CAC GGA ATT GAC ATC GAA GAA TTG GCT TCG
Met Arg Met Ile Val Gly His Gly Ile Asp Ile Glu Glu Leu Ala Ser
1 5 10 15

ATA GAA AGC GCA GTT ACA CGA CAT GAA GGA TTT GCT AAG CGT GTA CTG
Ile Glu Ser Ala Val Thr Arg His Glu Gly Phe Ala Lys Arg Val Leu
20 25 30

ACC GCT CAG GAA ATG GAG CGC TTC ACC AGT CTC AAA GGA CGC AGG CAA
Thr Ala Gln Glu Met Glu Arg Phe Thr Ser Leu Lys Gly Arg Arg Gln
35 40 45

ATA GAA TAT TTA GCT GGT CGC TGG TCG GCT AAG GAG GCC TTT TCC AAG
Ile Glu Tyr Leu Ala Gly Arg Trp Ser Ala Lys Glu Ala Phe Ser Lys
50 55 60

GCT ATG GGA ACG GGC ATT AGC AAG CTC GGT TTT CAG GAT TTG GAA GTC
Ala Met Gly Thr Gly Ile Ser Lys Leu Gly Phe Gln Asp Leu Glu Val
65 70 75 80

TTG AAC AAT GAA CGT GGG GCG CCT TAT TTT AGT CAG GCA CCA TTT TCA
Leu Asn Asn Glu Arg Gly Ala Pro Tyr Phe Ser Gln Ala Pro Phe Ser
85 90 95

GGA AAG ATT TGG CTG TCT ATC AGC CAC ACC GAT CAG TTT GTG ACA GCC
Gly Lys Ile Trp Leu Ser Ile Ser His Thr Asp Gln Phe Val Thr Ala
100 105 110

AGT GTC ATT TTG GAG GAA AAT CAT GAA AGC TAG
Ser Val Ile Leu Glu Glu Asn His Glu Ser
115 120



(2) INFORMATION FOR SEQ ID NO: 102:

(B) TYPE: amino acid

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

Met Arg Met Ile Val Gly His Gly Ile Asp Ile Glu Glu Leu Ala Ser
1 5 10 15

Ile Glu Ser Ala Val Thr Arg His Glu Gly Phe Ala Lys Arg Val Leu
20 25 30

Thr Ala Gln Glu Met Glu Arg Phe Thr Ser Leu Lys Gly Arg Arg Gln
35 40 45

Ile Glu Tyr Leu Ala Gly Arg Trp Ser Ala Lys Glu Ala Phe Ser Lys
50 55 60

Ala Met Gly Thr Gly Ile Ser Lys Leu Gly Phe Gln Asp Leu Glu Val
65 70 75 80

Leu Asn Asn Glu Arg Gly Ala Pro Tyr Phe Ser Gln Ala Pro Phe Ser
85 90 95

Gly Lys Ile Trp Leu Ser Ile Ser His Thr Asp Gln Phe Val Thr Ala
100 105 110

Ser Val Ile Leu Glu Glu Asn His Glu Ser
115 120



(2) INFORMATION FOR SEQ ID NO:103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1260 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 1..1260

(D) OTHER INFORMATION: MurZ 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

ATG AGA AAA ATT GTT ATC AAT GGT GGA TTA CCA CTG CAA GGT GAA ATC 48
Met Arg Lys Ile Val Ile Asn Gly Gly Leu Pro Leu Gln Gly Glu Ile
1 5 10 15

ACT ATT AGT GGT GCT AAA AAT AGT GTC GTT GCC TTA ATT CCA GCT ATT 96
Thr Ile Ser Gly Ala Lys Asn Ser Val Val Ala Leu Ile Pro Ala Ile
20 25 30

ATC TTG GCT GAT GAT GTG GTG ACT TTG GAT TGC GTT CCA GAT ATT TCG
Ile Leu Ala Asp Asp Val Val Thr Leu Asp Cys Val Pro Asp Ile Ser
35 40 45

GAT GTA GCC AGT CTT GTC GAA ATC ATG GAA TTG ATG GGA GCT ACT GTT
Asp Val Ala Ser Leu Val Glu Ile Met Glu Leu Met Gly Ala Thr Val
50 55 60

AAG CGT TAT GAC GAT GTA TTG GAG ATT GAC CCA AGA GGT GTT CAA AAT
Lys Arg Tyr Asp Asp Val Leu Glu Ile Asp Pro Arg Gly Val Gln Asn
65 70 75 80

ATT CCA ATG CCT TAT GGT AAA ATT AAC AGT CTT CGT GCA TCT TAC TAT
Ile Pro Met Pro Tyr Gly Lys Ile Asn Ser Leu Arg Ala Ser Tyr Tyr
85 90 95

TTT TAT GGG AGC CTC TTA GGC CGT TTT GGT GAA GCG ACA GTT GGT CTA
Phe Tyr Gly Ser Leu Leu Gly Arg Phe Gly Glu Ala Thr Val Gly Leu
100 105 110

CCG GGA GGA TGT GAT CTT GGT CCT CGT CCG ATT GAC TTA CAC CTT AAG
Pro Gly Gly Cys Asp Leu Gly Pro Arg Pro Ile Asp Leu His Leu Lys
115 120 125

GCG TTT GAA GCT ATG GGT GCC ACT GCT AGC TAC GAG GGA GAT AAC ATG
Ala Phe Glu Ala Met Gly Ala Thr Ala Ser Tyr Glu Gly Asp Asn Met
130 135 140

AAG TTA TCT GCT AAA GAT ACA GGA CTT CAT GGT GCA AGT ATT TAC ATG
Lys Leu Ser Ala Lys Asp Thr Gly Leu His Gly Ala Ser Ile Tyr Met
145 150 155 160

GAT ACG GTT AGT GTG GGA GCA ACG ATT AAT ACG ATG ATT GCT GCG GTT
Asp Thr Val Ser Val Gly Ala Thr Ile Asn Thr Met Ile Ala Ala Val
165 170 175

AAA GCA AAT GGT CGT ACT ATT ATT GAA AAT GCA GCC CGT GAA CCT GAG
Lys Ala Asn Gly Arg Thr Ile Ile Glu Asn Ala Ala Arg Glu Pro Glu
180 185 190

ATT ATT GAT GTA GCT ACT CTC TTG AAT AAT ATG GGT GCC CAT ATC CGT
Ile Ile Asp Val Ala Thr Leu Leu Asn Asn Met Gly Ala His Ile Arg
195 200 205

GGG GCA GGA ACT AAT ATC ATC ATT ATT GAT GGT GTT GAA AGA TTA CAT
Gly Ala Gly Thr Asn Ile Ile Ile Ile Asp Gly Val Glu Arg Leu His
210 215 220

GGG ACA CGT CAT CAG GTG ATT CCA GAC CGC ATT GAA GCT GGA ACA TAT
Gly Thr Arg His Gln Val Ile Pro Asp Arg Ile Glu Ala Gly Thr Tyr
225 230 235 240

ATA TCT TTA GCT GCT GCA GTT GGT AAA GGA ATT CGT ATA AAT AAT GTT
Ile Ser Leu Ala Ala Ala Val Gly Lys Gly Ile Arg Ile Asn Asn Val
245 250 255

CTT TAC GAA CAC CTG GAA GGG TTT GTT GCT AAG TTG GAA GAA ATG GGA
Leu Tyr Glu His Leu Glu Gly Phe Val Ala Lys Leu Glu Glu Met Gly
260 265 270






GTG AGA ATG ACT GTA TCT GAA GAC AGC ATT TTT GTC GAG GAA CAG TCT 864
Val Arg Met Thr Val Ser Glu Asp Ser Ile Phe Val Glu Glu Gln Ser
275 280 285

AAT TTG AAA GCA ATC AAT ATT AAG ACA GCT CCT TAC CCA GGC TTT GCA 912
Asn Leu Lys Ala Ile Asn Ile Lys Thr Ala Pro Tyr Pro Gly Phe Ala
290 295 300

ACT GAT TTG CAA CAA CCG CTT ACC CCT CTT TTA CTA AGA GCG AAT GGT 960
Thr Asp Leu Gln Gln Pro Leu Thr Pro Leu Leu Leu Arg Ala Asn Gly
305 310 315 320

CGT GGT ACA ATT GTC GAT ACG ATT TAC GAA AAA CGT GTA AAT CAT GTT 1008
Arg Gly Thr Ile Val Asp Thr Ile Tyr Glu Lys Arg Val Asn His Val
325 330 335

TTT GAA CTA GCA AAG ATG GAT GCG GAT ATT TCG ACA ACA AAT GGT CAT 1056
Phe Glu Leu Ala Lys Met Asp Ala Asp Ile Ser Thr Thr Asn Gly His
340 345 350

ATT TTG TAC ACG GGT GGA CGT GAT TTA CGT GGG GCC AGT GTT AAA GCG 1104
Ile Leu Tyr Thr Gly Gly Arg Asp Leu Arg Gly Ala Ser Val Lys Ala
355 360 365

ACC GAC TTA AGA GCT GGG GCT GCA CTA GTC ATT GCT GGG CTT ATG GCT 1152
Thr Asp Leu Arg Ala Gly Ala Ala Leu Val Ile Ala Gly Leu Met Ala
370 375 380

GAA GGC AAA ACT GAA ATT ACC AAT ATC GAG TTT ATC TTA CGT GGT TAT 1200
Glu Gly Lys Thr Glu Ile Thr Asn Ile Glu Phe Ile Leu Arg Gly Tyr
385 390 395 400

TCT GAT ATT ATC GAA AAA TTA CGT AAT TTA GGA GCG GAT ATT AGA CTT 1248
Ser Asp Ile Ile Glu Lys Leu Arg Asn Leu Gly Ala Asp Ile Arg Leu
405 410 415

GTT GAG GAT TAA 1260
Val Glu Asp
419



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 419 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

Met Arg Lys Ile Val Ile Asn Gly Gly Leu Pro Leu Gln Gly Glu Ile
1 5 10 15

Thr Ile Ser Gly Ala Lys Asn Ser Val Val Ala Leu Ile Pro Ala Ile
20 25 30

Ile Leu Ala Asp Asp Val Val Thr Leu Asp Cys Val Pro Asp Ile Ser
35 40 45

Asp Val Ala Ser Leu Val Glu Ile Met Glu Leu Met Gly Ala Thr Val
50 55 60

Lys Arg Tyr Asp Asp Val Leu Glu Ile Asp Pro Arg Gly Val Gln Asn
65 70 75 80

Ile Pro Met Pro Tyr Gly Lys Ile Asn Ser Leu Arg Ala Ser Tyr Tyr
85 90 95


Pro Gly Gly Cys Asp Leu Gly Pro Arg Pro Ile Asp Leu His Leu Lys
115 120 125


Lys Leu Ser Ala Lys Asp Thr Gly Leu His Gly Ala Ser Ile Tyr Met
145 150 155 160

Asp Thr Val Ser Val Gly Ala Thr Ile Asn Thr Met Ile Ala Ala Val
165 170 175

Lys Ala Asn Gly Arg Thr Ile Ile Glu Asn Ala Ala Arg Glu Pro Glu
180 185 190

Ile Ile Asp Val Ala Thr Leu Leu Asn Asn Met Gly Ala His Ile Arg
195 200 205


Gly Thr Arg His Gln Val Ile Pro Asp Arg Ile Glu Ala Gly Thr Tyr
225 230 235 240

Ile Ser Leu Ala Ala Ala Val Gly Lys Gly Ile Arg Ile Asn Asn Val
245 250 255

Leu Tyr Glu His Leu Glu Gly Phe Val Ala Lys Leu Glu Glu Met Gly
260 265 270

Val Arg Met Thr Val Ser Glu Asp Ser Ile Phe Val Glu Glu Gln Ser
275 280 285


Thr Asp Leu Gln Gln Pro Leu Thr Pro Leu Leu Leu Arg Ala Asn Gly
305 310 315 320

Arg Gly Thr Ile Val Asp Thr Ile Tyr Glu Lys Arg Val Asn His Val
325 330 335


Ile Leu Tyr Thr Gly Gly Arg Asp Leu Arg Gly Ala Ser Val Lys Ala
355 360 365

Thr Asp Leu Arg Ala Gly Ala Ala Leu Val Ile Ala Gly Leu Met Ala
370 375 380

Glu Gly Lys Thr Glu Ile Thr Asn Ile Glu Phe Ile Leu Arg Gly Tyr
385 390 395 400

Ser Asp Ile Ile Glu Lys Leu Arg Asn Leu Gly Ala Asp Ile Arg Leu
405 410 415

Val Glu Asp
419



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1008 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS

(B) LOCATION: 1..1008 

(D) OTHER INFORMATION: FtsZ 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:105: 

ATG ACA TTT TCA TTT GAT ACA GCT GCT GCT CAA GGG GCA GTG ATT AAA
Met Thr Phe Ser Phe Asp Thr Ala Ala Ala Gln Gly Ala Val Ile Lys
1 5 10 15

GTA ATT GGT GTC GGT GGA GGT GGT GGC AAT GCC ATC AAC CGT ATG GTC
Val Ile Gly Val Gly Gly Gly Gly Gly Asn Ala Ile Asn Arg Met Val
20 25 30

GAC GAA GGT GTT ACA GGC GTA GAA TTT ATC GCA GCA AAC ACA GAT GTA
Asp Glu Gly Val Thr Gly Val Glu Phe Ile Ala Ala Asn Thr Asp Val
35 40 45

CAA GCA TTG AGT AGT ACA AAA GCT GAG ACT GTT ATT CAG TTG GGA CCT
Gln Ala Leu Ser Ser Thr Lys Ala Glu Thr Val Ile Gln Leu Gly Pro
50 55 60

AAA TTG ACT CGT GGT TTG GGT GCA GGA GGT CAA CCT GAG GTT GGT CGT
Lys Leu Thr Arg Gly Leu Gly Ala Gly Gly Gln Pro Glu Val Gly Arg
65 70 75 80

AAA GCC GCT GAA GAA AGC GAA GAA ACA CTG ACG GAA GCT ATT AGT GGT
Lys Ala Ala Glu Glu Ser Glu Glu Thr Leu Thr Glu Ala Ile Ser Gly
85 90 95

GCC GAT ATG GTC TTC ATC ACT GCT GGT ATG GGA GGA GGC TCT GGA ACT
Ala Asp Met Val Phe Ile Thr Ala Gly Met Gly Gly Gly Ser Gly Thr
100 105 110



GGA GCT GCT CCT GTT ATT GCT CGT ATC GCC AAA GAT TTA GGT GCG CTT
Gly Ala Ala Pro Val Ile Ala Arg Ile Ala Lys Asp Leu Gly Ala Leu
115 120 125

ACA GTT GGT GTT GTA ACA CGT CCC TTT GGT TTT GAA GGA AGT AAG CGT
Thr Val Gly Val Val Thr Arg Pro Phe Gly Phe Glu Gly Ser Lys Arg
130 135 140

GGA CAA TTT GCT GTA GAA GGA ATC AAT CAA CTT CGT GAG CAT GTA GAC
Gly Gln Phe Ala Val Glu Gly Ile Asn Gln Leu Arg Glu His Val Asp
145 150 155 160

ACT CTA TTG ATT ATC TCA AAC AAC AAT TTG CTT GAA ATT GTT GAT AAG
Thr Leu Leu Ile Ile Ser Asn Asn Asn Leu Leu Glu Ile Val Asp Lys
165 170 175

AAA ACA CCG CTT TTG GAG GCT CTT AGC GAA GCG GAT AAC GTT CTT CGT
Lys Thr Pro Leu Leu Glu Ala Leu Ser Glu Ala Asp Asn Val Leu Arg
180 185 190

CAA GGT GTT CAA GGG ATT ACC GAT TTG ATT ACC AAT CCA GGA TTG ATT
Gln Gly Val Gln Gly Ile Thr Asp Leu Ile Thr Asn Pro Gly Leu Ile
195 200 205

AAC CTT GAC TTT GCC GAT GTG AAA ACG GTA ATG GCA AAC AAA GGG AAT
Asn Leu Asp Phe Ala Asp Val Lys Thr Val Met Ala Asn Lys Gly Asn
210 215 220

GCT CTT ATG GGT ATT GGT ATC GGT AGT GGA GAA GAA CGT GTG GTA GAA
Ala Leu Met Gly Ile Gly Ile Gly Ser Gly Glu Glu Arg Val Val Glu
225 230 235 240

GCG GCA CGT AAG GCA ATC TAT TCA CCA CTT CTT GAA ACA ACT ATT GAC
Ala Ala Arg Lys Ala Ile Tyr Ser Pro Leu Leu Glu Thr Thr Ile Asp
245 250 255

GGT GCT GAG GAT GTT ATC GTC AAC GTT ACT GGT GGT CTT GAC TTA ACC
Gly Ala Glu Asp Val Ile Val Asn Val Thr Gly Gly Leu Asp Leu Thr
260 265 270

TTG ATT GAG GCA GAA GAG GCT TCA CAA ATT GTG AAC GAG GCA GCA GGT
Leu Ile Glu Ala Glu Glu Ala Ser Gln Ile Val Asn Gln Ala Ala Gly
275 280 285

CAA GGA GTG AAC ATC TGG CTC GGT ACT TCA ATT GAT GAA AGT ATG CGT
Gln Gly Val Asn Ile Trp Leu Gly Thr Ser Ile Asp Glu Ser Met Arg
290 295 300

GAT GAA ATT CGT GTA ACA GTT GTC GCA ACG GGT GTT CGT CAA GAC CGC
Asp Glu Ile Arg Val Thr Val Val Ala Thr Gly Val Arg Gln Asp Arg
305 310 315 320

GTA GAA AAG GTT GTG GCT CCA CAA GCT AGA TCA CCG CGC CTA GGA TAA
Val Glu Lys Val Val Ala Pro Gln Ala Arg Ser Pro Arg Leu Gly *
325 330 335



(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 336 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:



Met Thr Phe Ser Phe Asp Thr Ala Ala Ala Gln Gly Ala Val Ile Lys
1 5 10 15

Val Ile Gly Val Gly Gly Gly Gly Gly Asn Ala Ile Asn Arg Met Val
20 25 30

Asp Glu Gly Val Thr Gly Val Glu Phe Ile Ala Ala Asn Thr Asp Val
35 40 45

Gln Ala Leu Ser Ser Thr Lys Ala Glu Thr Val Ile Gln Leu Gly Pro
50 55 60

Lys Leu Thr Arg Gly Leu Gly Ala Gly Gly Gln Pro Glu Val Gly Arg
65 70 75 80

Lys Ala Ala Glu Glu Ser Glu Glu Thr Leu Thr Glu Ala Ile Ser Gly
85 90 95


Thr Val Gly Val Val Thr Arg Pro Phe Gly Phe Glu Gly Ser Lys Arg
130 135 140

Gly Gln Phe Ala Val Glu Gly Ile Asn Gln Leu Arg Glu His Val Asp
145 150 155 160

Thr Leu Leu Ile Ile Ser Asn Asn Asn Leu Leu Glu Ile Val Asp Lys
165 170 175

Lys Thr Pro Leu Leu Glu Ala Leu Ser Glu Ala Asp Asn Val Leu Arg
180 185 190

Gln Gly Val Gln Gly Ile Thr Asp Leu Ile Thr Asn Pro Gly Leu Ile
195 200 205

Asn Leu Asp Phe Ala Asp Val Lys Thr Val Met Ala Asn Lys Gly Asn
210 215 220

Ala Leu Met Gly Ile Gly Ile Gly Ser Gly Glu Glu Arg Val Val Glu
225 230 235 240

Ala Ala Arg Lys Ala Ile Tyr Ser Pro Leu Leu Glu Thr Thr Ile Asp
245 250 255


Gln Gly Val Asn Ile Trp Leu Gly Thr Ser Ile Asp Glu Ser
290 295 300



(2) INFORMATION FOR SEQ ID NO: 107: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 525 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..525
(D) OTHER INFORMATION: grpE



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 



ATG GCC CAA GAT ATA AAA AAT GAA GAA GTA GAA GAA GTT CAA GAA GAG
Met Ala Gln Asp Ile Lys Asn Glu Glu Val Glu Glu Val Gln Glu Glu
1 5 10 15

GAA GTT GTG GAA ACA GCT GAA GAA ACA ACT CCT GAA AAG TCT GAG TTG
Glu Val Val Glu Thr Ala Glu Glu Thr Thr Pro Glu Lys Ser Glu Leu
20 25 30

GAC TTG GCA AAT GAA CGT GCA GAT GAG TTC GAA AAC AAA TAT CTT CGC
Asp Leu Ala Asn Glu Arg Ala Asp Glu Phe Glu Asn Lys Tyr Leu Arg
35 40 45

GCT CAT GCA GAA ATG CAA AAT ATC CAA CGC CGT GCC AAT GAA GAA CGT
Ala His Ala Glu Met Gln Asn Ile Gln Arg Arg Ala Asn Glu Glu Arg
50 55 60

CAA AAC TTG CAA CGT TAT CGT AGC CAG GAC TTG GCA AAA GCA ATC TTA
Gln Asn Leu Gln Arg Tyr Arg Ser Gln Asp Leu Ala Lys Ala Ile Leu
65 70 75 80

CCA TCT CTT GAC AAC CTT GAG CGT GCA CTT GCA GTT GAA GGT TTG ACA
Pro Ser Leu Asp Asn Leu Glu Arg Ala Leu Ala Val Glu Gly Leu Thr
85 90 95

GAT GAT GTG AAG AAG GGC TTG GCG ATG GTG CAA GAA AGC TTG ATT CAC
Asp Asp Val Lys Lys Gly Leu Ala Met Val Gln Glu Ser Leu Ile His
100 105 110

GCT TTG AAA GA^ GAA GGA ATT GAA GAA ATC GCA GCA GAT GGC GAA TTT
Ala Leu Lys Glu Glu Gly Ile Glu Glu Ile Ala Ala Asp Gly Glu Phe
115 120 125

GAC CAT AAC TAC CAT ATG GCC ATC CAA ACT CTC CCA GGA GAC GAT GAA
Asp His Asn Tyr His Met Ala Ile Gln Thr Leu Pro Gly Asp Asp Glu
130 135 140

CAC CCA GTA GAT ACC ATC GCC CAA GTC TTT CAA AAA GGC TAC AAA CTC
His Pro Val Asp Thr Ile Ala Gln Val Phe Gln Lys Gly Tyr Lys Leu
145 150 155 160

CAT GAC CGC ATC CTA CGC CCA GCA ATG GTA GTG GTG TAT AAC TAA
His Asp Arg Ile Leu Arg Pro Ala Met Val Val Val Tyr Asn *
165 170 174



(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 174 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

Met Ala Gln Asp Ile Lys Asn Glu Glu Val Glu Glu Val Gln Glu Glu
1 5 10 15

Glu Val Val Glu Thr Ala Glu Glu Thr Thr Pro Glu Lys Ser Glu Leu
20 25 30

Asp Leu Ala Asn Glu Arg Ala Asp Glu Phe Glu Asn Lys Tyr Leu Arg
35 40 45

Ala His Ala Glu Met Gln Asn Ile Gln Arg Arg Ala Asn Glu Glu Arg
50 55 60

Gln Asn Leu Gln Arg Tyr Arg Ser Gln Asp Leu Ala Lys Ala Ile Leu
65 70 75 80

Pro Ser Leu Asp Asn Leu Glu Arg Ala Leu Ala Val Glu Gly Leu Thr
85 90 95

Asp Asp Val Lys Lys Gly Leu Ala Met Val Gln Glu Ser Leu Ile His
100 105 110

Ala Leu Lys Glu Glu Gly Ile Glu Glu Ile Ala Ala Asp Gly Glu Phe
115 120 125

Asp His Asn Tyr His Met Ala Ile Gln Thr Leu Pro Gly Asp Asp Glu
130 135 140

His Pro Val Asp Thr Ile Ala Gln Val Phe Gln Lys Gly Tyr Lys Leu
145 150 155 160

His Asp Arg Ile Leu Arg Pro Ala Met Val Val Val Tyr Asn
165 170 174

(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 582 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..582
(D) OTHER INFORMATION: HI 164 B

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

ATG AAA ATC GGA ATA TTG GCC TTG CAA GGG GCC TTT GCA GAA CAT GCA
Met Lys Ile Gly Ile Leu Ala Leu Gln Gly Ala Phe Ala Glu His Ala
1 5 10 15

AAA GTG CTA GAT CAA TTA GGT GTC GAG AGT GTA GAA CTC AGA AAT CTA
Lys Val Leu Asp Gln Leu Gly Val Glu Ser Val Glu Leu Arg Asn Leu
20 25 30

GAT GAT TTT CAG CAA GAT CAG AGT GAC TTG TCG GGT TTG ATT TTG CCT
Asp Asp Phe Gln Gln Asp Gln Ser Asp Leu Ser Gly Leu Ile Leu Pro
35 40 45

GGT GGT GAG TCT ACA ACC ATG GGC AAG CTC TTA CGT GAC CAG AAC ATG
Gly Gly Glu Ser Thr Thr Met Gly Lys Leu Leu Arg Asp Gln Asn Met
50 55 60

CTA CTT CCC ATA CGA GAA GCC ATT CTA TCT GGC TTA CCA GTG TTT GGG
Leu Leu Pro Ile Arg Glu Ala Ile Leu Ser Gly Leu Pro Val Phe Gly
65 70 75 80

ACC TGT GCG GGC TTA ATT TTG CTG GCT AAG GAA ATC ACT TCT CAG AAA
Thr Cys Ala Gly Leu Ile Leu Leu Ala Lys Glu Ile Thr Ser Gln Lys
85 90 95

GAG AGT CAT CTA GGA ACT ATG GAT ATG GTG GTC GAG CGT AAT GCT TAT
Glu Ser His Leu Gly Thr Met Asp Met Val Val Glu Arg Asn Ala Tyr
100 105 110

GGG CGC CAA TTA GGA AGT TTC TAC ACG GAA GCA GAA TGT AAG GGA GTT
Gly Arg Gln Leu Gly Ser Phe Tyr Thr Glu Ala Glu Cys Lys Gly Val
115 120 125

GGC AAG ATT CCA ATG ACC TTT ATC CGT GGT CCG ATT ATC AGT AGT GTT
Gly Lys Ile Pro Met Thr Phe Ile Arg Gly Pro Ile Ile Ser Ser Val
130 135 140

GGT GAS GGT GTA GAA ATT TTA GCA ATA GTG AAC AAT CAA ATT GTT GCA
Gly Glu Gly Val Glu Ile Leu Ala Ile Val Asn Asn Gln Ile Val Ala
145 150 155 160

GCC CAA GAA AAA AAT ATG TTG GTA AGT TCT TTT CAT CCA GAA TTG ACT
Ala Gln Glu Lys Asn Met Leu Val Ser Ser Phe His Pro Glu Leu Thr
165 170 175

GAT GAT GTG CGC TTG CAC CAG TAC TTT ATC AAT ATG TGT AAA CAA AAA
Asp Asp Val Arg Leu His Gln Tyr Phe Ile Asn Met Cys Lys Glu Lys
180 185 190



(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 194 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

Met Lys Ile Gly Ile Leu Ala Leu Gln Gly Ala Phe Ala Glu His Ala
1 5 10 15

Lys Val Leu Asp Gln Leu Gly Val Glu Ser Val Glu Leu Arg Asn Leu
20 25 30

Asp Asp Phe Gln Gln Asp Gln Ser Asp Leu Ser Gly Leu Ile Leu Pro
35 40 45

Gly Gly Glu Ser Thr Thr Met Gly Lys Leu Leu Arg Asp Gln Asn Met
50 55 60

Leu Leu Pro Ile Arg Glu Ala Ile Leu Ser Gly Leu Pro Val Phe Gly
65 70 75 80

Thr Cys Ala Gly Leu Ile Leu Leu Ala Lys Glu Ile Thr Ser Gln Lys
85 90 95

Glu Ser His Leu Gly Thr Met Asp Met Val Val Glu Arg Asn Ala Tyr
100 105 110

Gly Arg Gln Leu Gly Ser Phe Tyr Thr Glu Ala Glu Cys Lys Gly Val
115 120 125

Gly Lys Ile Pro Met Thr Phe Ile Arg Gly Pro Ile Ile Ser Ser Val
130 135 140

Gly Glu Gly Val Glu Ile Leu Ala Ile Val Asn Asn Gln Ile Val Ala
145 150 155 160

Ala Gln Glu Lys Asn Met Leu Val Ser Ser Phe His Pro Glu Leu Thr
165 170 175

Gln Tyr Phe Ile Asn Met Cys Lys Glu Lys



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 546 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..543
(D) OTHER INFORMATION: pgsA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:



ATG AAA AAA GAA CAA ATT CCC AAT CTC TTA ACA ATA GGT CGA ATT CTC
Met Lys Lys Glu Gln Ile Pro Asn Leu Leu Thr Ile Gly Arg Ile Leu
1 5 10 15

TTT ATA CCT ATT TTT ATC TTT ATT TTA ACG ATA GGA AAT TCG ATA GAG
Phe Ile Pro Ile Phe Ile Phe Ile Leu Thr Ile Gly Asn Ser Ile Glu
20 25 30

AGT CAT ATA GTT GCA GCT ATT ATC TTT GCT GTT GCC AGT ATT ACC GAC
Ser His Ile Val Ala Ala Ile Ile Phe Ala Val Ala Ser Ile Thr Asp
35 40 45

TAT TTA GAT GGA TAT TTA GCT CGT AAA TGG AAT GTG GTC AGT AAT TTT
Tyr Leu Asp Gly Tyr Leu Ala Arg Lys Trp Asn Val Val Ser Asn Phe
50 55 60

GGT AAA TTT GCA GAT CCT ATG GCG GAT AAG TTA CTA GTT ATG TCG GCT
Gly Lys Phe Ala Asp Pro Met Ala Asp Lys Leu Leu Val Met Ser Ala
65 70 75 80

TTT ATT ATG TTG ATT GAG TTA GGT ATG GCT CCG GCT TGG ATT GTT GCA
Phe Ile Met Leu Ile Glu Leu Gly Met Ala Pro Ala Trp Ile Val Ala
85 90 95

GTG ATT ATC TGT CGT GAG TTA GCT GTG ACA GGT TTA AGG CTT TTA TTG
Val Ile Ile Cys Arg Glu Leu Ala Val Thr Gly Leu Arg Leu Leu Leu
100 105 110

GTT GAA ACT GGT GGA ACA ATT TTA GCA GCA GCA ATG CCT GGA AAA ATT
Val Glu Thr Gly Gly Thr Ile Leu Ala Ala Ala Met Pro Gly Lys Ile
115 120 125

AAA ACT TTT AGT CAG ATG TTT GCT ATT ATT TTC TTG CTA TTA CAT TGG
Lys Thr Phe Ser Gln Met Phe Ala Ile Ile Phe Leu Leu Leu His Trp
130 135 140

ACT TTG CTT GGT CAA GTT CTA CTT TAT GTA GCC TTA TTT TTC ACT ATC
Thr Leu Leu Gly Gln Val Leu Leu Tyr Val Ala Leu Phe Phe Thr Ile
145 150 155 160

TAC TCT GGC TAT GAC TAT TTC AAG GGT AGT GCC TAT GTA TTT AAA GGG
Tyr Ser Gly Tyr Asp Tyr Phe Lys Gly Ser Ala Tyr Val Phe Lys Gly
165 170 175

ACA TTT GGT TCG AAA TGA
Thr Phe Gly Ser Lys



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 181 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

Met Lys Lys Glu Gln Ile Pro Asn Leu Leu Thr Ile Gly Arg Ile Leu
1 5 10 15

Phe Ile Pro Ile Phe Ile Phe Ile Leu Thr Ile Gly Asn Ser Ile Glu
20 25 30

Ser His Ile Val Ala Ala Ile Ile Phe Ala Val Ala Ser Ile Thr Asp
35 40 45

Tyr Leu Asp Gly Tyr Leu Ala Arg Lys Trp Asn Val Val Ser Asn Phe
50 55 60

Gly Lys Phe Ala Asp Pro Met Ala Asp Lys Leu Leu Val Met Ser Ala
65 70 75 80

Phe Ile Met Leu Ile Glu Leu Gly Met Ala Pro Ala Trp Ile Val Ala
85 90 95

(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1224 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1221
(D) OTHER INFORMATION: RodA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

ATG AAA CGT TCT CTC GAC TCT AGA GTC GAT TAT AGT TTG CTC TTG CCA
Met Lys Arg Ser Leu Asp Ser Arg Val Asp Tyr Ser Leu Leu Leu Pro
1 5 10 15

GTA TTT TTT CTA CTG GTC ATC GGT GTG GTG GCT ATC TAT ATA GCC GTT
Val Phe Phe Leu Leu Val Ile Gly Val Val Ala Ile Tyr Ile Ala Val
20 25 30

AGT CAT GAT TAT CCC AAT AAT ATT CTG CCC ATT TTA GGG CAG CAG GTC
Ser His Asp Tyr Pro Asn Asn Ile Leu Pro Ile Leu Gly Gln Gln Val
35 40 45

GCC TGG ATT GCC TTG GGG CTT GTG ATT GGT TTT GTG GTC ATG CTC TTT
Ala Trp Ile Ala Leu Gly Leu Val Ile Gly Phe Val Val Met Leu Phe
50 55 60

AAT ACA GAA TTT CTT TGG AAG GTG ACC CCC TTT CTA TAT ATT TTA GGC
Asn Thr Glu Phe Leu Trp Lys Val Thr Pro Phe Leu Tyr Ile Leu Gly
65 70 75 80

TTG GGA CTT ATG ATC TTG CCG ATT GTA TTT TAT AAT CCA AGC TTA GTT
Leu Gly Leu Met Ile Leu Pro Ile Val Phe Tyr Asn Pro Ser Leu Val
85 90 95

GCA TCA ACG GGT GCC AAA AAC TGG GTA TCA ATA AAT GGA ATT ACC CTA
Ala Ser Thr Gly Ala Lys Asn Trp Val Ser Ile Asn Gly Ile Thr Leu
100 105 110

TTT CAA CCG TCA GAA TTT ATG AAG ATA TCC TAT ATC CTC ATG TTG GCT
Phe Gln Pro Ser Glu Phe Met Lys Ile Ser Tyr Ile Leu Met Leu Ala
115 120 125

CGT GTC ATT GTC CAA TTT ACA AAG AAA CAT AAG GAA TGG AGA CGC ACG
Arg Val Ile Val Gln Phe Thr Lys Lys His Lys Glu Trp Arg Arg Thr
130 135 140

GTT CCG CTG GAC TTT TTG TTA ATT TTC TGG ATG ATT CTC TTT ACC ATT
Val Pro Leu Asp Phe Leu Leu Ile Phe Trp Met Ile Leu Phe Thr Ile
145 150 155 160

CCA GTC CTA GTT CTT TTA GCA CTT CAA AGT GAC TTG GGG ACG GCT TTG
Pro Val Leu Val Leu Leu Ala Leu Gln Ser Asp Leu Gly Thr Ala Leu
165 170 175

GTT TTT GTA GCC ATT TTC TCA GGA ATC GTT TTA TTA TCA GGG GTT TCT
Val Phe Val Ala Ile Phe Ser Gly Ile Val Leu Leu Ser Gly Val Ser
180 185 190

TGG AAA ATT ATT ATC CCA GTA TTT GTG ACT GCT GTA ACA GGA GTT GCT
Trp Lys Ile Ile Ile Pro Val Phe Val Thr Ala Val Thr Gly Val Ala
195 200 205

GGT TTC TTA GCT ATC TTT ATT AGC AAG GAC GGA CGA GCT TTT CTT CAC
Gly Phe Leu Ala Ile Phe Ile Ser Lys Asp Gly Arg Ala Phe Leu His
210 215 220

CAG ATT GGA ATG CCG ACC TAC CAA ATC AAT CGG ATT TTG GCT TGG CTC
Gln Ile Gly Met Pro Thr Tyr Gln Ile Asn Arg Ile Leu Ala Trp Leu
225 230 235 240

AAT CCC TTT GAG TTT GCC CAA ACA ACG ACT TAC CAG CAG GCT CAA GGG
Asn Pro Phe Glu Phe Ala Gln Thr Thr Thr Tyr Gln Gln Ala Gln Gly
245 250 255

CAG ATT GCC ATT GGG AGT GGT GGC TTA TTT GGT CAG GGA TTT AAT GCT
Gln Ile Ala Ile Gly Ser Gly Gly Leu Phe Gly Gln Gly Phe Asn Ala
260 265 270

TCG AAT CTG CTT ATC CCA GTT CGA GAG TCA GAT ATG ATT TTT ACG GTT
Ser Asn Leu Leu Ile Pro Val Arg Glu Ser Asp Met Ile Phe Thr Val
275 280 285

ATT GCA GAA GAT TTT GGC TTT ATT GGC TCT GTC CTG GTT ATT GCC CTC
Ile Ala Glu Asp Phe Gly Phe Ile Gly Ser Val Leu Val Ile Ala Leu
290 295 300

TAT CTC ATG TTG ATT TAC CGT ATG TTG AAG ATT ACT CTT AAA TCA AAT
Tyr Leu Met Leu Ile Tyr Arg Met Leu Lys Ile Thr Leu Lys Ser Asn
305 310 315 320

AAC CAG TTC TAC ACT TAT ATT TCC ACA GGT TTG ATT ATG ATG TTG CTC
Asn Gln Phe Tyr Thr Tyr Ile Ser Thr Gly Leu Ile Met Met Leu Leu
325 330 335

TTC CAC ATC TTT GAG AAT ATC GGT GCT GTG ACT GGA CTA CTT CCT TTG
Phe His Ile Phe Glu Asn Ile Gly Ala Val Thr Gly Leu Leu Pro Leu
340 345 350

ACG GGG ATT CCC TTG CCT TTC ATT TCG CAA GGG GGA TCA GCG ATT ATC
Thr Gly Ile Pro Leu Pro Phe Ile Ser Gln Gly Gly Ser Ala Ile Ile
355 360 365

AGT AAT CTG ATT GGT GTT GGT TTG CTT TTA TCG ATG AGT TAC CAG ACT
Ser Asn Leu Ile Gly Val Gly Leu Leu Leu Ser Met Ser Tyr Gln Thr
370 375 380

AAT CTA GCT GAA GAA AAG AGC GGA AAA GTC CCA TTC AAA CGG AAA AAG
Asn Leu Ala Glu Glu Lys Ser Gly Lys Val Pro Phe Lys Arg Lys Lys
385 390 395 400

GTT GTA TTA AAA CAA ATT AAA TAA
Val Val Leu Lys Gln Ile Lys
405



(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 407 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

Met Lys Arg Ser Leu Asp Ser Arg Val Asp Tyr Ser Leu Leu Leu Pro
1 5 10 15

Val Phe Phe Leu Leu Val Ile Gly Val Val Ala Ile Tyr Ile Ala Val
20 25 30

Ser His Asp Tyr Pro Asn Asn Ile Leu Pro Ile Leu Gly Gln Gln Val
35 40 45

Ala Trp Ile Ala Leu Gly Leu Val Ile Gly Phe Val Val Met Leu Phe
50 55 60

Asn Thr Glu Phe Leu Trp Lys Val Thr Pro Phe Leu Tyr Ile Leu Gly
65 70 75 80

Leu Gly Leu Met Ile Leu Pro Ile Val Phe Tyr Asn Pro Ser Leu Val
85 90 95

Ala Ser Thr Gly Ala Lys Asn Trp Val Ser Ile Asn Gly Ile Thr Leu
100 105 110

Phe Gln Pro Ser Glu Phe Met Lys Ile Ser Tyr Ile Leu Met Leu Ala
115 120 125

Arg Val Ile Val Gln Phe Thr Lys Lys His Lys Glu Trp Arg Arg Thr
130 135 140

Val Pro Leu Asp Phe Leu Leu Ile Phe Trp Met Ile Leu Phe Thr Ile
145 150 155 160

Pro Val Leu Val Leu Leu Ala Leu Gln Ser Asp Leu Gly Thr Ala Leu
165 170 175

Val Phe Val Ala Ile Phe Ser Gly Ile Val Leu Leu Ser Gly Val Ser
180 185 190

Trp Lys Ile Ile Ile Pro Val Phe Val Thr Ala Val Thr Gly Val Ala
195 200 205

Gly Phe Leu Ala Ile Phe Ile Ser Lys Asp Gly Arg Ala Phe Leu His
210 215 220

Gln Ile Gly Met Pro Thr Tyr Gln Ile Asn Arg Ile Leu Ala Trp Leu
225 230 235 240

Asn Pro Phe Glu Phe Ala Gln Thr Thr Thr Tyr Gln Gln Ala Gln Gly
245 250 255

Gln Ile Ala Ile Gly Ser Gly Gly Leu Phe Gly Gln Gly Phe Asn Ala
260 265 270


Asn Gln Phe Tyr Thr Tyr Ile Ser Thr Gly Leu Ile Met Met Leu Leu
325 330 335

Phe His Ile Phe Glu Asn Ile Gly Ala Val Thr Gly Leu Leu Pro Leu
340 345 350

Thr Gly Ile Pro Leu Pro Phe Ile Ser Gln Gly Gly Ser Ala Ile Ile
355 360 365


Val Val Leu Lys Gln Ile Lys
405

(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1311 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..1311

(D) OTHER INFORMATION: SecY

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

ATG TTT TTT AAA TTA TTA AGA GAA GCT CTT AAA GTC AAG CAG GTT CGA 48
Met Phe Phe Lys Leu Leu Arg Glu Ala Leu Lys Val Lys Gln Val Arg
1 5 10 15

TCA AAA ATT TTA TTT ACA ATT TTT ATC GTT TTG GTC TTT CGT ATC GGA
Ser Lys Ile Leu Phe Thr Ile Phe Ile Val Leu Val Phe Arg Ile Gly
20 25 30

ACT AGC ATT ACA GTT CCT GGT GTG AAT GCC AAT AGC TTG AAT GCT TTA
Thr Ser Ile Thr Val Pro Gly Val Asn Ala Asn Ser Leu Asn Ala Leu
35 40 45

AGT GGA TTA TCC TTC TTA AAC ATG TTG AGC TTG GTG TCG GGG AAT GCC
Ser Gly Leu Ser Phe Leu Asn Met Leu Ser Leu Val Ser Gly Asn Ala
50 55 60

CTA AAA AAC TTT TCG ATT TTT GCC CTA GGA GTT AGT CCC TAT ATC ACC
Leu Lys Asn Phe Ser Ile Phe Ala Leu Gly Val Ser Pro Tyr Ile Thr
65 70 75 80

GCT TCT ATT GTT GTC CAA CTC TTG CAA ATG GAT ATT TTA CCC AAG TTT
Ala Ser Ile Val Val Gln Leu Leu Gln Met Asp Ile Leu Pro Lys Phe
85 90 95

GTA GAG TGG GGT AAA CAA GGG GAA GTA GGT CGA AGA AAA TTG AAT CAA
Val Glu Trp Gly Lys Gln Gly Glu Val Gly Arg Arg Lys Leu Asn Gln
100 105 110

GCT ACT CGT TAT ATT GCT CTA GTT CTC GCT TTT GTG CAA TCT ATC GGG
Ala Thr Arg Tyr Ile Ala Leu Val Leu Ala Phe Val Gln Ser Ile Gly
115 120 125

ATT ACA GCT GGT TTT AAT ACC TTG GCT GGA GCT CAA TTG ATT AAA ACT
Ile Thr Ala Gly Phe Asn Thr Leu Ala Gly Ala Gln Leu Ile Lys Thr
130 135 140

GCT TTA ACT CCA CAA GTT TTT CTG ACG ATT GGT ATC ATC TTA ACA GCT
Ala Leu Thr Pro Gln Val Phe Leu Thr Ile Gly Ile Ile Leu Thr Ala
145 150 155 160

GGT AGT ATG ATT GTC ACT TGG TTG GGT GAG CAA ATT ACA GAT AAG GGA
Gly Ser Met Ile Val Thr Trp Leu Gly Glu Gln Ile Thr Asp Lys Gly
165 170 175

TAC GGA AAC GGT GTT TCC ATG ATT ATC TTT GCC GGG ATT GTT TCC TCA
Tyr Gly Asn Gly Val Ser Met Ile Ile Phe Ala Gly Ile Val Ser Ser
180 185 190

ATT CCA GAG ATG ATT CAG GGC ATC TAT GTG GAC TAC TTT GTG AAC GTC
Ile Pro Glu Met Ile Gln Gly Ile Tyr Val Asp Tyr Phe Val Asn Val
195 200 205

CCA AGT AGC CGT ATC ACT TCA TCT ATC ATT TTC GTA ATC ATT TTG ATT
Pro Ser Ser Arg Ile Thr Ser Ser Ile Ile Phe Val Ile Ile Leu Ile
210 215 220

ATT ACT GTA TTG TTG ATT ATT TAC TTT ACA ACT TAT GTT CAA CAA GCA
Ile Thr Val Leu Leu Ile Ile Tyr Phe Thr Thr Tyr Val Gln Gln Ala
225 230 235 240

GAA TAC AAA ATT CCA ATC CAA TAT ACT AAG GTT GCA CAA GGT GCT CCA
Glu Tyr Lys Ile Pro Ile Gln Tyr Thr Lys Val Ala Gln Gly Ala Pro
245 250 255

TCT AGC TCT TAC CTT CCG TTA AAG GTA AAT CCT GCT GGA GTT ATC CCT
Ser Ser Ser Tyr Leu Pro Leu Lys Val Asn Pro Ala Gly Val Ile Pro
260 265 270

GTT ATC TTT GCC AGT TCG ATT ACT GCA GCG CCT GCG GCT ATT CTT CAG
Val Ile Phe Ala Ser Ser Ile Thr Ala Ala Pro Ala Ala Ile Leu Gln
275 280 285

TTT TTG AGT GCC ACA GGT CAT GAT TGG GCT TGG GTA AGG GTA GCA CAA
Phe Leu Ser Ala Thr Gly His Asp Trp Ala Trp Val Arg Val Ala Gln
290 295 300

GAG ATG TTG GCA ACT ACT TCT CCA ACT GGT ATT GCC ATG TAT GCT TTG
Glu Met Leu Ala Thr Thr Ser Pro Thr Gly Ile Ala Met Tyr Ala Leu
305 310 315 320

TTG ATT ATT CTC TTT ACA TTC TTC TAT ACG TTT GTA CAG ATT AAT CCT
Leu Ile Ile Leu Phe Thr Phe Phe Tyr Thr Phe Val Gln Ile Asn Pro
325 330 335

GAA AAA GCA GCA GAG AGC CTA CAA AAG AGT GGT GCC TAT ATC CAT GGA
Glu Lys Ala Ala Glu Ser Leu Gln Lys Ser Gly Ala Tyr Ile His Gly
340 345 350

GTT CGT CCT GGT AAA GGT ACA GAA GAA TAT ATG TCT AAA CTT CTT CGT
Val Arg Pro Gly Lys Gly Thr Glu Glu Tyr Met Ser Lys Leu Leu Arg
355 360 365

CGT CTT GCA ACT GTT GGT TCC CTC TTC CTT GGT GTG ATT TCC ATT TTA
Arg Leu Ala Thr Val Gly Ser Leu Phe Leu Gly Val Ile Ser Ile Leu
370 375 380

CCG ATT GCA GCT AAA GAT GTA TTT GGT CTT TCT GAT GTT GTT GCC TTT
Pro Ile Ala Ala Lys Asp Val Phe Gly Leu Ser Asp Val Val Ala Phe
385 390 395 400

GGT GGA ACA AGT CTC TTG ATC ATT ATC TCT ACA GGT ATC GAA GGA ATC
Gly Gly Thr Ser Leu Leu Ile Ile Ile Ser Thr Gly Ile Glu Gly Ile
405 410 415

AAG CAA TTG GAA GGT TAC CTA TTG AAA CGT AAG TAT GTT GGT TTC ATG
Lys Gln Leu Glu Gly Tyr Leu Leu Lys Arg Lys Tyr Val Gly Phe Met
420 425 430

GAC AGA ACA GAA TAA
Asp Arg Thr Glu *
435



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 437 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

Met Phe Phe Lys Leu Leu Arg Glu Ala Leu Lys Val Lys Gln Val Arg
1 5 10 15

Ser Lys Ile Leu Phe Thr Ile Phe Ile Val Leu Val Phe Arg Ile Gly
20 25 30

Thr Ser Ile Thr Val Pro Gly Val Asn Ala Asn Ser Leu Asn Ala Leu
35 40 45

Ser Gly Leu Ser Phe Leu Asn Met Leu Ser Leu Val Ser Gly Asn Ala
50 55 60

Leu Lys Asn Phe Ser Ile Phe Ala Leu Gly Val Ser Pro Tyr Ile Thr
65 70 75 80

Ala Ser Ile Val Val Gln Leu Leu Gln Met Asp Ile Leu Pro Lys Phe
85 90 95

Val Glu Trp Gly Lys Gln Gly Glu Val Gly Arg Arg Lys Leu Asn Gln
100 105 110

Ala Thr Arg Tyr Ile Ala Leu Val Leu Ala Phe Val Gln Ser Ile Gly
115 120 125

Ile Thr Ala Gly Phe Asn Thr Leu Ala Gly Ala Gln Leu Ile Lys Thr
130 135 140

Ala Leu Thr Pro Gln Val Phe Leu Thr Ile Gly Ile Ile Leu Thr Ala
145 150 155 160

Gly Ser Met Ile Val Thr Trp Leu Gly Glu Gln Ile Thr Asp Lys Gly
165 170 175

Tyr Gly Asn Gly Val Ser Met Ile Ile Phe Ala Gly Ile Val Ser Ser
180 185 190

Ile Pro Glu Met Ile Gln Gly Ile Tyr Val Asp Tyr Phe Val Asn Val
195 200 205

Pro Ser Ser Arg Ile Thr Ser Ser Ile Ile Phe Val Ile Ile Leu Ile
210 215 220

Ile Thr Val Leu Leu Ile Ile Tyr Phe Thr Thr Tyr Val Gln Gln Ala
225 230 235 240

Glu Tyr Lys Ile Pro Ile Gln Tyr Thr Lys Val Ala Gln Gly Ala Pro
245 250 255

Ser Ser Ser Tyr Leu Pro Leu Lys Val Asn Pro Ala Gly Val Ile Pro
260 265 270

Val Ile Phe Ala Ser Ser Ile Thr Ala Ala Pro Ala Ala Ile Leu Gln
275 280 285

Phe Leu Ser Ala Thr Gly His Asp Trp Ala Trp Val Arg Val Ala Gln
290 295 300

Glu Met Leu Ala Thr Thr Ser Pro Thr Gly Ile Ala Met Tyr Ala Leu
305 310 315 320

(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(D) OTHER INFORMATION: FtsH

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

ATG AAA AAA CAA AAT AAT GGT TTA ATT AAA AAT CCT TTT CTA TGG TTA 48
Met Lys Lys Gln Asn Asn Gly Leu Ile Lys Asn Pro Phe Leu Trp Leu
1 5 10 15

TTA TTT ATC TTT TTC CTT GTG ACA GGA TTC CAG TAT TTC TAT TCT GGG 96
Leu Phe Ile Phe Phe Leu Val Thr Gly Phe Gln Tyr Phe Tyr Ser Gly
20 25 30

AAT AAC TCA GGA GGA AGT CAG CAA ATC AAC TAT ACT GAG TTG GTA CAA 144
Asn Asn Ser Gly Gly Ser Gln Gln Ile Asn Tyr Thr Glu Leu Val Gln
35 40 45

GAA ATT ACC GAT GGT AAT GAA AAA GAA TTA ACT TAC CAA CCA AAT GTT 192
Glu Ile Thr Asp Gly Asn Glu Lys Glu Leu Thr Tyr Gln Pro Asn Val
50 55 60

AGT GT ATC GAA GTT TCT GGT GTC TAT AAA AAT CCT AAA ACA AGT AAA
Ser Val Ile Glu Val Ser Gly Val Tyr Lys Asn Pro Lys Thr Ser Lys
65 70 75 80

GAA GGA ACA GGT ATT CAG TTT TTC ACG CCA TCT GTT ACT AAG GTA GAG
Glu Gly Thr Gly Ile Gln Phe Phe Thr Pro Ser Val Thr Lys Val Glu
85 90 95

AAA TTT ACC AGC ACT ATT CTT CCT GCA GAT ACT ACC GTA TCA GAA TTG
Lys Phe Thr Ser Thr Ile Leu Pro Ala Asp Thr Thr Val Ser Glu Leu
100 105 110

CAA AAA CTT GCT ACT GAC CAT AAA GCA GAA GTA ACT GTT AAG CAT GAA
Gln Lys Leu Ala Thr Asp His Lys Ala Glu Val Thr Val Lys His Glu
115 120 125

AGT TCA AGT GGT ATA TGG ATT AAT CTA CTC GTA TCC ATT GTG CCA TTT
Ser Ser Ser Gly Ile Trp Ile Asn Leu Leu Val Ser Ile Val Pro Phe
130 135 140

GGA ATT CTA TTC TTC TTC CTA TTC TCT ATG ATG GGA AAT ATG GGA GGA
Gly Ile Leu Phe Phe Phe Leu Phe Ser Met Met Gly Asn Met Gly Gly
145 150 155 160

GGC AAT GGC CGT AAT CCA ATG AGT TTT GGA CGT AGT AAG GCT AAA GCA
Gly Asn Gly Arg Asn Pro Met Ser Phe Gly Arg Ser Lys Ala Lys Ala
165 170 175

GCA AAT AAA GAA GAT ATT AAA GTA AGA TTT TCA GAT GTT GCT GGA GCT
Ala Asn Lys Glu Asp Ile Lys Val Arg Phe Ser Asp Val Ala Gly Ala
180 185 190

GAG GAA GAA AAA CAA GAA CTA GTT GAA GTT GTT GAG TTC TTA AAA GAT
Glu Glu Glu Lys Gln Glu Leu Val Glu Val Val Glu Phe Leu Lys Asp
195 200 205

CCA AAA CGA TTC ACA AAA CTT GGA GCC CGT ATT CCA GCA GGT GTT CTT
Pro Lys Arg Phe Thr Lys Leu Gly Ala Arg Ile Pro Ala Gly Val Leu
210 215 220

TTG GAG GGA CCT CCG GGG ACA GGT AAG ACT TTG CTT GCT AAG GCA GTC
Leu Glu Gly Pro Pro Gly Thr Gly Lys Thr Leu Leu Ala Lys Ala Val
225 230 235 240

GCT GGA GAA GCA GGT GTT CCA TTC TTT AGT ATC TCA GGT TCT GAC TTT
Ala Gly Glu Ala Gly Val Pro Phe Phe Ser Ile Ser Gly Ser Asp Phe
245 250 255

GTA GAA ATG TTT GTC GGA GTT GGA GCT AGT CGT GTT CGC TCT CTT TTT
Val Glu Met Phe Val Gly Val Gly Ala Ser Arg Val Arg Ser Leu Phe
260 265 270

GAG GAT GCC AAA AAA GCA GCA CCA GCT ATC ATC TTT ATC GAT CTA AAT
Glu Asp Ala Lys Lys Ala Ala Pro Ala Ile Ile Phe Ile Asp Leu Asn
275 280 285

GAT GCT GTT GGA CGT CAA CGT GGA GTC GGT CTC GGC GGA GGT AAT GAC
Asp Ala Val Gly Arg Gln Arg Gly Val Gly Leu Gly Gly Gly Asn Asp
290 295 300

GAA CGT GAA CAA ACC TTG AAC CAA CTT TTG ATT GAG ATG GAT GGT TTT
Glu Arg Glu Gln Thr Leu Asn Gln Leu Leu Ile Glu Met Asp Gly Phe
305 310 315 320

GAG GGA AAT GAA GGG ATT ATC GTC ATC GCT GCG ACA AAC CGT TCA GAT
Glu Gly Asn Glu Gly Ile Ile Val Ile Ala Ala Thr Asn Arg Ser Asp
325 330 335

GTA CTT GAT CCT GCC CTT TTG CGT CCA GGA CGT TTT GAT AGA AAA GTA
Val Leu Asp Pro Ala Leu Leu Arg Pro Gly Arg Phe Asp Arg Lys Val
340 345 350

TTG GTT GGC CGT CCT GAT GTT AAA GGT CGT GAA GCA ATC TTG AAA GTT
Leu Val Gly Arg Pro Asp Val Lys Gly Arg Glu Ala Ile Leu Lys Val
355 360 365

CAC GCT AAG AAC AAG CCT TTA GCA GAA GAT GTT GAT TTG AAA TTA GTG
His Ala Lys Asn Lys Pro Leu Ala Glu Asp Val Asp Leu Lys Leu Val
370 375 380

GCT CAA CAA ACT CCA GGC TTT GTT GGT GCT GAT TTA GAG AAT GTC TTG
Ala Gln Gln Thr Pro Gly Phe Val Gly Ala Asp Leu Glu Asn Val Leu
385 390 395 400

AAT GAA GCA GCT TTA GTT GCT GCT CGT CGC AAT AAA TCG ATA ATT GAT
Asn Glu Ala Ala Leu Val Ala Ala Arg Arg Asn Lys Ser Ile Ile Asp
405 410 415

GCT TCA GAT ATT GAT GAA GCA GAA GAT AGA GTT ATT GCT GGA CCT TCT
Ala Ser Asp Ile Asp Glu Ala Glu Asp Arg Val Ile Ala Gly Pro Ser
420 425 430

AAG AAA GAT AAG ACA GTT TCA CAA AAA GAA CGA GAA TTG GTT GCT TAC
Lys Lys Asp Lys Thr Val Ser Gln Lys Glu Arg Glu Leu Val Ala Tyr
435 440 445

CAT GAG GCA GGA CAT ACC ATT GTT GGT CTA GTC TTG TCG ACT GCT CGC
His Glu Ala Gly His Thr Ile Val Gly Leu Val Leu Ser Thr Ala Arg
450 455 460

GTT GTC CAT AAG GTT ACA ATT GTA CCA CGC GGC CGT GCA GGC GGA TAC
Val Val His Lys Val Thr Ile Val Pro Arg Gly Arg Ala Gly Gly Tyr
465 470 475 480

ATG ATT GCA CTT CCT AAA GAG GAT CAA ATG CTT CTA TCT AAA GAA GAT
Met Ile Ala Leu Pro Lys Glu Asp Gln Met Leu Leu Ser Lys Glu Asp
485 490 495

ATG AAA GAG CAA TTG GCT GGC TTA ATG GGT GGA CGT GTA GCT GAA GAA
Met Lys Glu Gln Leu Ala Gly Leu Met Gly Gly Arg Val Ala Glu Glu
500 505 510

ATT ATC TTT AAT GTC CAA ACT ACA GGA GCT TCA AAC GAC TTT GAA CAA
Ile Ile Phe Asn Val Gln Thr Thr Gly Ala Ser Asn Asp Phe Glu Gln
515 520 525

GCG ACA CAA ATG GCA CGT GCA ATG GTT ACA GAG TAC GGT ATG AGT GAA
Ala Thr Gln Met Ala Arg Ala Met Val Thr Glu Tyr Gly Met Ser Glu
530 535 540

AAA CTT GGC CCA GTA CAA TAT GAA GGA AAC CAT GCT ATG CTT GGT GCA
Lys Leu Gly Pro Val Gln Tyr Glu Gly Asn His Ala Met Leu Gly Ala
545 550 555 560

CAG AGT CCT CAA AAA TCA ATT TCA GAA CAA ACA GCT TAT GAA ATT GAT
Gln Ser Pro Gln Lys Ser Ile Ser Glu Gln Thr Ala Tyr Glu Ile Asp
565 570 575

GAA GAG GTT CGT TCA TTA TTA AAT GAG GCA CGA AAT AAA GCT GCT GAA
Glu Glu Val Arg Ser Leu Leu Asn Glu Ala Arg Asn Lys Ala Ala Glu
580 585 590

ATT ATT CAG TCA AAT CGT GAA ACT CAC AAG TTA ATT GCA GAA GCA TTA
Ile Ile Gln Ser Asn Arg Glu Thr His Lys Leu Ile Ala Glu Ala Leu
595 600 605

TTG AAA TAC GAA ACA TTG GAT AGT ACA CAA ATT AAA GCT CTT TAC GAA
Leu Lys Tyr Glu Thr Leu Asp Ser Thr Gln Ile Lys Ala Leu Tyr Glu
610 615 620

ACA GGA AAG ATG CCT GAA GCA GTA GAA GAG GAA TCT CAT GCA CTA TCC
Thr Gly Lys Met Pro Glu Ala Val Glu Glu Glu Ser His Ala Leu Ser
625 630 635 640

TAT GAT GAA GTA AAG TCA AAA ATG AAT GAC GAA AAA TAA
Tyr Asp Glu Val Lys Ser Lys Met Asn Asp Glu Lys
645 650



(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

Met Lys Lys Gln Asn Asn Gly Leu Ile Lys Asn Pro Phe Leu Trp Leu
1 5 10 15

Leu Phe Ile Phe Phe Leu Val Thr Gly Phe Gln Tyr Phe Tyr Ser Gly
20 25 30

Asn Asn Ser Gly Gly Ser Gln Gln Ile Asn Tyr Thr Glu Leu Val Gln
35 40 45

Glu Ile Thr Asp Gly Asn Glu Lys Glu Leu Thr Tyr Gln Pro Asn Val
50 55 60

Ser Val Ile Glu Val Ser Gly Val Tyr Lys Asn Pro Lys Thr Ser Lys
65 70 75 80

Glu Gly Thr Gly Ile Gln Phe Phe Thr Pro Ser Val Thr Lys Val Glu
85 90 95

Lys Phe Thr Ser Thr Ile Leu Pro Ala Asp Thr Thr Val Ser Glu Leu
100 105 110

Gln Lys Leu Ala Thr Asp His Lys Ala Glu Val Thr Val Lys His Glu
115 120 125

Ser Ser Ser Gly Ile Trp Ile Asn Leu Leu Val Ser Ile Val Pro Phe
130 135 140

Gly Ile Leu Phe Phe Phe Leu Phe Ser Met Met Gly Asn Met Gly Gly
145 150 155 160

Gly Asn Gly Arg Asn Pro Met Ser Phe Gly Arg Ser Lys Ala Lys Ala
165 170 175

Ala Asn Lys Glu Asp Ile Lys Val Arg Phe Ser Asp Val Ala Gly Ala
180 185 190

Glu Glu Glu Lys Gln Glu Leu Val Glu Val Val Glu Phe Leu Lys Asp
195 200 205

Pro Lys Arg Phe Thr Lys Leu Gly Ala Arg Ile Pro Ala Gly Val Leu
210 215 220

Leu Glu Gly Pro Pro Gly Thr Gly Lys Thr Leu Leu Ala Lys Ala Val
225 230 235 240

Ala Gly Glu Ala Gly Val Pro Phe Phe Ser Ile Ser Gly Ser Asp Phe
245 250 255

Val Glu Met Phe Val Gly Val Gly Ala Ser Arg Val Arg Ser Leu Phe
260 265 270

Glu Asp Ala Lys Lys Ala Ala Pro Ala Ile Ile Phe Ile Asp Leu Asn
275 280 285

Asp Ala Val Gly Arg Gln Arg Gly Val Gly Leu Gly Gly Gly Asn Asp
290 295 300

Glu Arg Glu Gln Thr Leu Asn Gln Leu Leu Ile Glu Met Asp Gly Phe
305 310 315 320

Glu Gly Asn Glu Gly Ile Ile Val Ile Ala Ala Thr Asn Arg Ser Asp
325 330 335

Val Leu Asp Pro Ala Leu Leu Arg Pro Gly Arg Phe Asp Arg Lys Val
340 345 350

Leu Val Gly Arg Pro Asp Val Lys Gly Arg Glu Ala Ile Leu Lys Val
355 360 365

His Ala Lys Asn Lys Pro Leu Ala Glu Asp Val Asp Leu Lys Leu Val
370 375 380

Ala Gln Gln Thr Pro Gly Phe Val Gly Ala Asp Leu Glu Asn Val Leu
385 390 395 400

Asn Glu Ala Ala Leu Val Ala Ala Arg Arg Asn Lys Ser Ile Ile Asp
405 410 415

Ala Ser Asp Ile Asp Glu Ala Glu Asp Arg Val Ile Ala Gly Pro Ser
420 425 430

Lys Lys Asp Lys Thr Val Ser Gln Lys Glu Arg Glu Leu Val Ala Tyr
435 440 445


Ala Thr Gln Met Ala Arg Ala Met Val Thr Glu Tyr Gly Met Ser Glu
530 535 540

Lys Leu Gly Pro Val Gln Tyr Glu Gly Asn His Ala Met Leu Gly Ala
545 550 555 560


Ile Ile Gln Ser Asn Arg Glu Thr His Lys Leu Ile Ala Glu Ala Leu
595 600 605

Leu Lys Tyr Glu Thr Leu Asp Ser Thr Gln Ile Lys Ala Leu Tyr Glu
610 615 620

Thr Gly Lys Met Pro Glu Ala Val Glu Glu Glu Ser His Ala Leu Ser
625 630 635 640 



(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1278 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1278

(D) OTHER INFORMATION: FtsY

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

ATG GGA TTG TTT GAC CGT CTA TTC GGA AAA AAA GAA GAA CCT AAA ATC
Met Gly Leu Phe Asp Arg Leu Phe Gly Lys Lys Glu Glu Pro Lys Ile
1 5 10 15

GAA GAA GTT GTA AAA GAA GCT CTG GAA AAT CTT GAT TTG TCT GAA GAT
Glu Glu Val Val Lys Glu Ala Leu Glu Asn Leu Asp Leu Ser Glu Asp
20 25 30

GTT GAT CCT ACC TTC ACA GAA GTT GAG GAA GTT TCT CAG GAA GAA GCA
Val Asp Pro Thr Phe Thr Glu Val Glu Glu Val Ser Gln Glu Glu Ala
35 40 45

GAG GTT GAA ATT GTT GAA CAA GCT GTG TTC CAA GAA GAG GAA ATC CAA
Glu Val Glu Ile Val Glu Gln Ala Val Phe Gln Glu Glu Glu Ile Gln
50 55 60

GAC ACA GTT GAA GAA AGT CTG GAT TTA GAG CCA GTT GTA GAA GTT TCT
Asp Thr Val Glu Glu Ser Leu Asp Leu Glu Pro Val Val Glu Val Ser
65 70 75 80

CAA AAA GAA GTC GAA GAA TTT CCA CAC TCA GAA GAA GGG AAT ACT GAG
Gln Lys Glu Val Glu Glu Phe Pro His Ser Glu Glu Gly Asn Thr Glu
85 90 95

TTT CTA GAG ACT ATA GAA GAA AAT AAT TCT GAA GTT CTT GAA CCA GAA
Phe Leu Glu Thr Ile Glu Glu Asn Asn Ser Glu Val Leu Glu Pro Glu
100 105 110

AGG CCT CAA GCA GAA GAA ACC GTT CAG GAA AAA TAT GAC CGC AGT CTT
Arg Pro Gln Ala Glu Glu Thr Val Gln Glu Lys Tyr Asp Arg Ser Leu
115 120 125

AAG AAA ACT CGT ACA GGT TTC GGT GCC CGC TTG AAT GCC TTC TTT GCT
Lys Lys Thr Arg Thr Gly Phe Gly Ala Arg Leu Asn Ala Phe Phe Ala
130 135 140

AAC TTC CGC TCT GTT GAC GAA GAA TTT TTC GAG GAA CTG GAA GAA CTG
Asn Phe Arg Ser Val Asp Glu Glu Phe Phe Glu Glu Leu Glu Glu Leu
145 150 155 160

CTG ATT ATG AGT GAT GTT GGT GTC CAA GTC GCT TCT AAC TTA ACG GAG
Leu Ile Met Ser Asp Val Gly Val Gln Val Ala Ser Asn Leu Thr Glu
165 170 175

GAA CTA CGT TAC GAA GCC AAG CTT GAA AAT GCC AAG AAA CCT GAT GCA
Glu Leu Arg Tyr Glu Ala Lys Leu Glu Asn Ala Lys Lys Pro Asp Ala
180 185 190

CTT CGT CGT GTC ATC ATT GAG AAA TTG GTT GAG CTT TAT GAA AAG GAT
Leu Arg Arg Val Ile Ile Glu Lys Leu Val Glu Leu Tyr Glu Lys Asp
195 200 205

GGT AGC TAC GAT GAA AGC ATC CAC TTC CAA GAT AAC TTG ACA GTT ATG
Gly Ser Tyr Asp Glu Ser Ile His Phe Gln Asp Asn Leu Thr Val Met
210 215 220

CTC TTT GTT GGT GTG AAT GGT GTT GGG AAA ACA ACT TCT ATC GGA AAA
Leu Phe Val Gly Val Asn Gly Val Gly Lys Thr Thr Ser Ile Gly Lys
225 230 235 240

CTA GCC CAC CGC TAC AAA CAA GCT GGT AAG AAG GTC ATG CTG GTT GCA
Leu Ala His Arg Tyr Lys Gln Ala Gly Lys Lys Val Met Leu Val Ala
245 250 255

GCA GAT ACC TTC CGT GCG GGT GCA GTA GCT CAG CTA GCT GAA TGG GGC
Ala Asp Thr Phe Arg Ala Gly Ala Val Ala Gln Leu Ala Glu Trp Gly
260 265 270

CGA CGA GTA GAT GTT CCA GTA GTA ACT GGA CCT GAA AAA GCT GAT CCA
Arg Arg Val Asp Val Pro Val Val Thr Gly Pro Glu Lys Ala Asp Pro
275 280 285

GCC AGC GTG GTC TTT GAT GGT ATG GAA CGT GCC GTG GCT GAA GGT ATC
Ala Ser Val Val Phe Asp Gly Met Glu Arg Ala Val Ala Glu Gly Ile
290 295 300

GAT ATT CTC ATG ATT GAT ACT GCT GGT CGT CTG CAA AAT AAG GAT AAC
Asp Ile Leu Met Ile Asp Thr Ala Gly Arg Leu Gln Asn Lys Asp Asn
305 310 315 320

CTT ATG GCT GAG TTG GAA AAG ATT GGT CGT ATT ATC AAA CGT GTT GTG
Leu Met Ala Glu Leu Glu Lys Ile Gly Arg Ile Ile Lys Arg Val Val
325 330 335

CCA GAA GCA CCA CAT GAA ACC TTC TTG GCA CTT GAT GCA TCA ACA GGT
Pro Glu Ala Pro His Glu Thr Phe Leu Ala Leu Asp Ala Ser Thr Gly
340 345 350

CAA AAT GCC CTA GTA CAG GCC AAA GAA TTT TCG AAA ATC ACA CCT TTA
Gln Asn Ala Leu Val Gln Ala Lys Glu Phe Ser Lys Ile Thr Pro Leu
355 360 365

ACG GGA ATT GTT TTG ACT AAG ATT GAT GGA ACT GCT CGA GGA GGT GTG
Thr Gly Ile Val Leu Thr Lys Ile Asp Gly Thr Ala Arg Gly Gly Val
370 375 380

GTT CTA GCC ATT CGT GAA GAA CTC AAT ATT CCT GTA AAA TTG ATT GGT
Val Leu Ala Ile Arg Glu Glu Leu Asn Ile Pro Val Lys Leu Ile Gly
385 390 395 400

TTT GGT GAA AAA ATC GAT GAT ATT GGA GAG TTT AAC TCA GAA AAC TTT
Phe Gly Glu Lys Ile Asp Asp Ile Gly Glu Phe Asn Ser Glu Asn Phe
405 410 415

ATG AAA GGT CTC TTG GAA GGT TTA ATC TAA
Met Lys Gly Leu Leu Glu Gly Leu Ile *
420 425



(2) INFORMATION FOR SEQ ID NO: 120:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 425 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

Met Gly Leu Phe Asp Arg Leu Phe Gly Lys Lys Glu Glu Pro Lys Ile
1 5 10 15

Glu Glu Val Val Lys Glu Ala Leu Glu Asn Leu Asp Leu Ser Glu Asp
20 25 30

Val Asp Pro Thr Phe Thr Glu Val Glu Glu Val Ser Gln Glu Glu Ala
35 40 45

Glu Val Glu Ile Val Glu Gln Ala Val Phe Gln Glu Glu Glu Ile Gln
50 55 60

Asp Thr Val Glu Glu Ser Leu Asp Leu Glu Pro Val Val Glu Val Ser
65 70 75 80

Gln Lys Glu Val Glu Glu Phe Pro His Ser Glu Glu Gly Asn Thr Glu
85 90 95



Arg Pro Gln Ala Glu Glu Thr Val Gln Glu Lys Tyr
115 120

Lys Lys Thr Arg Thr Gly Phe Gly Ala Arg Leu Asn Ala Phe Phe Ala
130 135 140

Leu Thr Val Met

Ala Glu Gly Ile

Asp Ile Leu Met Ile Asp Thr Ala Gly Arg Leu Gln
305 310 315

Leu Met Ala Glu Leu Glu Lys Ile Gly Arg Ile Ile Lys Arg Val Val

Thr Gly Ile Val Leu Thr Lys Ile Asp Gly Thr Ala Arg Gly Gly Val
370 375 380

Val Leu Ala Ile Arg Glu Glu Leu Asn Ile Pro Val Lys Leu Ile Gly
385 390 395 400



(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 891 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..891 

(D) OTHER INFORMATION: HI1146 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

ATG ACA AAG AAA CAA CTT CAC TTG GTG ATT GTG ACA GGG ATG GGT GGC
Met Thr Lys Lys Gln Leu His Leu Val Ile Val Thr Gly Met Gly Gly
1 5 10 15

GCA GGG AAA ACT GTA GCC ATT CAG TCC TTC GAG GAT CTA GGT TAT TTC
Ala Gly Lys Thr Val Ala Ile Gln Ser Phe Glu Asp Leu Gly Tyr Phe
20 25 30

ACC ATT GAT AAT ATG CCG CCA GCT CTC TTG CCT AAG TTT TTG CAG CTG
Thr Ile Asp Asn Met Pro Pro Ala Leu Leu Pro Lys Phe Leu Gln Leu
35 40 45

GTT GAA ATT AAG GAA GAC AAT CCT AAG TTG GCC TTG GTA GTG GAT ATG
Val Glu Ile Lys Glu Asp Asn Pro Lys Leu Ala Leu Val Val Asp Met
50 55 60

CGT AGT CGT TCT TTC TTT TCA GAG ATT CAA GCT GTT TTG GAT GAG TTG
Arg Ser Arg Ser Phe Phe Ser Glu Ile Gln Ala Val Leu Asp Glu Leu
65 70 75 80

GAA AAT CAA GAT GGT TTG GAT TTC AAA ATC CTC TTT TTG GAT GCG GCT
Glu Asn Gln Asp Gly Leu Asp Phe Lys Ile Leu Phe Leu Asp Ala Ala
85 90 95

GAT AAG GAA TTG GTC GCT CGT TAC AAG GAA ACC AGA CGG AGT CAC CCA
Asp Lys Glu Leu Val Ala Arg Tyr Lys Glu Thr Arg Arg Ser His Pro
100 105 110

CTA GCA GCA GAC GGT CGT ATT TTA GAT GGA ATC AAG TTG GAA CGT GAA
Leu Ala Ala Asp Gly Arg Ile Leu Asp Gly Ile Lys Leu Glu Arg Glu
115 120 125

CTC TTG GCA CCT TTG AAA AAT ATG AGC CAA AAT GTG GTG GAT ACG ACT
Leu Leu Ala Pro Leu Lys Asn Met Ser Gln Asn Val Val Asp Thr Thr
130 135 140

GAA CTC ACT CCA CGT GAG CTG CGC AAA ACC CTT GCA GAG CAG TTT TCA
Glu Leu Thr Pro Arg Glu Leu Arg Lys Thr Leu Ala Glu Gln Phe Ser
145 150 155 160

GAC CAA GAA CAA GCT CAG TCT TTC CGT ATC GAA GTC ATG TCT TTC GGA
Asp Gln Glu Gln Ala Gln Ser Phe Arg Ile Glu Val Met Ser Phe Gly
165 170 175

TTT AAG TAT GGA ATC CCG ATT GAT GCG GAC TTG GTC TTT GAT GTC CGT
Phe Lys Tyr Gly Ile Pro Ile Asp Ala Asp Leu Val Phe Asp Val Arg
180 185 190

TTC TTG CCA AAT CCC TAT TAT TTA CCA GAA CTG AGA AAC CAA ACG GGT
Phe Leu Pro Asn Pro Tyr Tyr Leu Pro Glu Leu Arg Asn Gln Thr Gly
195 200 205

GTG GAT GAA CCT GTT TAT GAT TAT GTC ATG AAC CAT CCT GAG TCA GAA
Val Asp Glu Pro Val Tyr Asp Tyr Val Met Asn His Pro Glu Ser Glu
210 215 220

GAC TTT TAT CAA CAT TTA TTG GCC TTG ATT GAG CCG ATT CTG CCA AGT
Asp Phe Tyr Gln His Leu Leu Ala Leu Ile Glu Pro Ile Leu Pro Ser
225 230 235 240

TAC CAA AAG GAA GGT AAG TCC GTT TTG ACC ATT GCC ATG GGA TGT ACG
Tyr Gln Lys Glu Gly Lys Ser Val Leu Thr Ile Ala Met Gly Cys Thr
245 250 255

GGT GGA CAA CAC CGT AGT GTG GCA TTT GCT AAA CGC TTG GTG CAG GAC
Gly Gly Gln His Arg Ser Val Ala Phe Ala Lys Arg Leu Val Gln Asp
260 265 270

TTA TCC AAG AAT TGG TCT GTT AAT GAA GGG CAT CGC GAC AAA GAC CGC
Leu Ser Lys Asn Trp Ser Val Asn Glu Gly His Arg Asp Lys Asp Arg
275 280 285

AGA AAG GAA ACG GTA AAC CGT TCA TGA
Arg Lys Glu Thr Val Asn Arg Ser *
290 295



(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 296 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:

Met Thr Lys Lys Gln Leu His Leu Val Ile Val Thr Gly Met Gly Gly
1 5 10 15

Ala Gly Lys Thr Val Ala Ile Gln Ser Phe Glu Asp Leu Gly Tyr Phe
20 25 30

Thr Ile Asp Asn Met Pro Pro Ala Leu Leu Pro Lys Phe Leu Gln Leu
35 40 45

Val Glu Ile Lys Glu Asp Asn Pro Lys Leu Ala Leu Val Val Asp Met
50 55 60

Arg Ser Arg Ser Phe Phe Ser Glu Ile Gln Ala Val Leu Asp Glu Leu
65 70 75 80

Glu Asn Gln Asp Gly Leu Asp Phe Lys Ile Leu Phe Leu Asp Ala Ala
85 90 95

Asp Lys Glu Leu Val Ala Arg Tyr Lys Glu Thr Arg Arg Ser His Pro
100 105 110

Leu Ala Ala Asp Gly Arg Ile Leu Asp Gly Ile Lys Leu Glu Arg Glu
115 120 125

Leu Leu Ala Pro Leu Lys Asn Met Ser Gln Asn Val Val Asp Thr Thr
130 135 140

Glu Leu Thr Pro Arg Glu Leu Arg Lys Thr Leu Ala Glu Gln Phe Ser
145 150 155 160

Asp Gln Glu Gln Ala Gln Ser Phe Arg Ile Glu Val Met Ser Phe Gly
165 170 175

Phe Lys Tyr Gly Ile Pro Ile Asp Ala Asp Leu Val Phe Asp Val Arg
180 185 190

Phe Leu Pro Asn Pro Tyr Tyr Leu Pro Glu Leu Arg Asn Gln Thr Gly
195 200 205

Val Asp Glu Pro Val Tyr Asp Tyr Val Met Asn His Pro Glu Ser Glu
210 215 220

Asp Phe Tyr Gln His Leu Leu Ala Leu Ile Glu Pro Ile Leu Pro Ser
225 230 235 240

Tyr Gln Lys Glu Gly Lys Ser Val Leu Thr Ile Ala Met Gly Cys Thr
245 250 255

Gly Gly Gln His Arg Ser Val Ala Phe Ala Lys Arg Leu Val Gln Asp
260 265 270

Leu Ser Lys Asn Trp Ser Val Asn Glu Gly His Arg Asp Lys Asp Arg
275 280 285

Arg Lys Glu Thr Val Asn Arg Ser
290 295



(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 329 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

Met Val Glu Val Pro Asp Glu Arg Leu Gln Lys Leu Thr Glu Met Ile
1 5 10 15

Thr Pro Lys Lys Thr Val Pro Thr Thr Phe Glu Phe Thr Asp Ile Ala
20 25 30

Gly Ile Val Lys Gly Ala Ser Lys Gly Glu Gly Leu Gly Asn Lys Phe
35 40 45

Leu Ala Asn Ile Arg Glu Val Asp Ala Ile Val His Val Val Arg Ala
50 55 60

Phe Asp Asp Glu Asn Val Met Arg Glu Gln Gly Arg Glu Asp Ala Phe
65 70 75 80

Val Asp Pro Leu Ala Asp Ile Asp Thr Ile Asn Leu Glu Leu Ile Leu
85 90 95

Ala Asp Leu Glu Ser Val Asn Lys Arg Tyr Ala Arg Val Glu Lys Met
100 105 110

Ala Arg Thr Gln Lys Asp Lys Glu Ser Val Ala Glu Phe Asn Val Leu
115 120 125

Gln Lys Ile Lys Pro Val Leu Glu Asp Gly Lys Ser Ala Arg Thr Ile
130 135 140

Glu Phe Thr Asp Glu Glu Gln Lys Val Val Lys Gly Leu Phe Leu Leu
145 150 155 160

Thr Thr Lys Pro Val Leu Tyr Val Ala Asn Val Asp Glu Asp Val Val
165 170 175

Ser Glu Pro Asp Ser Ile Asp Tyr Val Lys Gln Ile Arg Glu Phe Ala
180 185 190

Ala Thr Glu Asn Ala Glu Val Val Val Ile Ser Ala Arg Ala Glu Glu
195 200 205

Glu Ile Ser Glu Leu Asp Asp Glu Asp Lys Lys Glu Phe Leu Glu Ala
210 215 220



Arg Ala Trp Thr Phe Lys Arg Gly Met Lys Ala Pro Gln Ala Ala Gly
260 265 270



Ser Tyr Glu Asp Leu Val Lys Tyr Gly Ser Glu Lys Ala Val Lys Glu
290 295 300

Ala Gly Arg Leu Arg Glu Glu Gly Lys Glu Tyr Ile Val Gln Asp Gly



(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 189 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

Met Ser Ala Ser Glu Gly Arg Asp Pro Tyr Glu Asp Tyr Leu Ala Ile
1 5 10 15

Asn Lys Glu Leu Glu Ser Tyr Asn Leu Arg Leu Met Glu Arg Pro Gln
20 25 30

Ile Ile Val Thr Asn Lys Met Asp Met Pro Glu Ser Gln Glu Asn Leu
35 40 45

Glu Glu Phe Lys Lys Lys Leu Ala Glu Asn Tyr Asp Glu Phe Glu Glu
50 55 60

Leu Pro Ala Ile Phe Pro Ile Ser Gly Leu Thr Lys Gln Gly Leu Ala
65 70 75 80

Thr Leu Leu Asp Ala Thr Ala Glu Leu Leu Asp Lys Thr Pro Glu Phe
85 90 95


Asp Glu Glu Glu Lys Ala Phe Glu Ile Ser Arg Asp Asp Asp Ala Thr
115 120 125

Trp Val Leu Ser Gly Glu Lys Leu Met Lys Leu Phe Asn Met Thr Asn
130 135 140

Phe Asp Arg Asp Glu Ser Val Met Lys Phe Ala Arg Gln Leu Arg Gly
145 150 155 160 

Met Gly Val Asp Glu Ala Leu Arg Ala Arg Gly Ala Lys Asp Gly Asp 
165 170 175 



(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 226 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 

Met Asn Ile Gln Gln Leu Arg Tyr Val Val Ala Ile Ala Asn Ser Gly
1 5 10 15

Thr Phe Arg Glu Ala Ala Glu Lys Met Tyr Val Ser Gln Pro Ser Leu
20 25 30

Ser Ile Ser Val Arg Asp Leu Glu Lys Glu Leu Gly Phe Lys Ile Phe
35 40 45

Arg Arg Thr Ser Ser Gly Thr Phe Leu Thr Arg Arg Gly Met Glu Phe
50 55 60

Tyr Glu Lys Ala Gln Glu Leu Val Lys Gly Phe Asp Ile Phe Gln Asn
65 70 75 80

Gln Tyr Ala Asn Pro Glu Glu Glu Lys Asp Glu Phe Ser Val Ala Ser
85 90 95

Gln His Tyr Asp Phe Leu Pro Pro Thr Ile Thr Ala Phe Ser Glu Arg
100 105 110

Tyr Pro Asp Tyr Lys Asn Phe Arg Ile Phe Glu Ser Thr Thr Val Gln
115 120 125

Ile Leu Asp Glu Val Ala Gln Gly His Ser Glu Ile Gly Ile Ile Tyr
130 135 140

Leu Asn Asn Gln Asn Lys Lys Gly Ile Met Gln Arg Val Glu Lys Leu
145 150 155 160

Gly Leu Glu Val Ile Glu Leu Ile Pro Phe His Thr His Ile Tyr Leu
165 170 175

Cys Glu Gly His Pro Leu Ala Gln Lys Glu Glu Leu Val Met Glu Asp
180 185 190

Leu Ala Asp Leu Pro Thr Val Arg Phe Thr Gln Glu Lys Asp Glu Tyr
195 200 205 

Leu Tyr Tyr Ser Glu Asn Phe Val Asp Thr Ser Ala Thr His Arg Cys 
210 215 220 



(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 119 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

Met Lys Lys Arg Ala Ile Val Ala Val Ile Val Leu Leu Leu Ile Gly
1 5 10 15

Leu Asp Gln Leu Val Lys Ser Tyr Ile Val Gln Gln Ile Pro Leu Gly
20 25 30

Glu Val Arg Ser Trp Ile Pro Asn Phe Val Ser Leu Thr Tyr Leu Gln
35 40 45

Asn Arg Gly Ala Ala Phe Ser Ile Leu Gln Asp Gln Gln Leu Leu Phe
50 55 60

Ala Val Ile Thr Leu Val Val Val Ile Gly Ala Ile Trp Tyr Leu His
65 70 75 80

Lys His Met Glu Asp Ser Phe Trp Met Val Leu Gly Leu Thr Leu Ile
85 90 95

Ile Ala Gly Gly Pro Gly Asn Phe Ile Asp Arg Val Ser Gln Gly Phe
100 105 110



(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 170 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:

Met Ser Lys Tyr Leu Leu Lys Leu Leu Val Tyr Cys Phe Ser Ala Leu
1 5 10 15

Thr Phe Gly Ser Leu Phe Leu Ile Ile Gly Phe Ile Leu Ile Lys Gly
20 25 30

Leu Pro His Leu Ser Leu Ser Leu Phe Ser Trp Thr Tyr Thr Ser Glu
35 40 45

Asn Ile Ser Leu Met Pro Ala Ile Ile Ser Thr Val Ile Leu Val Phe
50 55 60

Gly Ala Leu Leu Leu Ala Leu Pro Ile Gly Ile Phe Ala Gly Phe Tyr
65 70 75 80

Leu Val Glu Tyr Thr Lys Lys Asp Ser Leu Cys Val Lys Ile Met Arg
85 90 95

Leu Ala Ser Asp Thr Leu Ser Gly Ile Pro Ser Ile Val Phe Gly Leu
100 105 110

Phe Gly Met Leu Phe Phe Val Val Phe Leu Gly Phe Gln Tyr Ser Leu
115 120 125

Leu Ser Gly Ile Leu Thr Ser Val Ile Met Val Leu Pro Val Ile Ile
130 135 140

Arg Ser Thr Glu Glu Ala Leu Leu Ser Val Ser Asp Ser Met Arg Gln
145 150 155 160

Ala Ser Tyr Gly Leu Gly Ala Leu Ser Tyr
165 170

(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 154 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

Met Lys Thr Glu Gln Thr Ala Ser Lys Thr Ser Ala Leu Lys Gly Lys
1 5 10 15

Glu Val Ala Asp Phe Glu Leu Met Gly Val Asp Gly Lys Thr Tyr Arg
20 25 30

Leu Ser Asp Tyr Lys Gly Lys Lys Val Tyr Leu Lys Phe Trp Ala Ser
35 40 45

Trp Cys Ser Ile Cys Leu Ala Ser Leu Pro Asp Thr Asp Glu Ile Ala
50 55 60

Lys Glu Ala Gly Asp Asp Tyr Val Val Leu Thr Val Val Ser Pro Gly
65 70 75 80

His Lys Gly Glu Gln Ser Glu Ala Asp Phe Lys Asn Trp Tyr Lys Gly
85 90 95



(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 181 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 



Met Lys Lys Glu Gln Ile Pro Asn Leu Leu Thr Ile Gly Arg Ile Leu
1 5 10 15

Phe Ile Pro Ile Phe Ile Phe Ile Leu Thr Ile Gly Asn Ser Ile Glu
20 25 30

Ser His Ile Val Ala Ala Ile Ile Phe Ala Val Ala Ser Ile Thr Asp
35 40 45

Tyr Leu Asp Gly Tyr Leu Ala Arg Lys Trp Asn Val Val Ser Asn Phe
50 55 60

Gly Lys Phe Ala Asp Pro Met Ala Asp Lys Leu Leu Val Met Ser Ala
65 70 75 80

Phe Ile Met Leu Ile Glu Leu Gly Met Ala Pro Ala Trp Ile Val Ala
85 90 95



Thr Leu Leu Gly Gln Val Leu Leu Tyr Val Ala Leu Phe Phe Thr Ile
145 150 155 160

Tyr Ser Gly Tyr Asp Tyr Phe Lys Gly Ser Ala Tyr Val Phe Lys Gly



(2) INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 111 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

Leu Arg Leu Lys Glu Met Asn Gly Asp Met Ile His Ala Ala Tyr Asp
1 5 10 15

Leu Gly Ala Ser Gln Phe Gln Met Phe Lys Glu Ile Met Leu Pro Tyr
20 25 30

Leu Thr Pro Ser Ile Ile Ala Gly Tyr Phe Met Ala Phe Thr Tyr Ser
35 40 45

Leu Asp Asp Phe Ala Val Thr Phe Phe Val Thr Gly Asn Gly Phe Ser
50 55 60

Thr Leu Ser Val Glu Ile Tyr Ser Arg Ala Arg Lys Gly Ile Ser Leu

Glu Ile Asn Ala Leu Ser Ala Leu Val Phe Leu Phe Ser Ile Ile Leu
85 90 95

Val Val Gly Tyr Tyr Phe Ile Ser Arg Glu Lys Glu Glu Gln Ala
100 105 110

(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 59 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

Pro Gln Phe Thr Glu Glu Thr Gly Ile Gln Val Gln Tyr Glu Ala Phe
1 5 10 15

Asp Ser Asn Glu Ala Met Tyr Thr Lys Ile Lys Gln Gly Gly Thr Thr
20 25 30

Tyr Asp Ile Ala Ile Pro Ser Glu Tyr Met Ile Asn Lys Met Lys Asp
35 40 45

Glu Asp Leu Leu Val Pro Leu Asp Tyr Ser Lys
50 55

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 232 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 

Met Gln Thr Gln Glu Lys His Ser Gln Ala Ala Val Leu Gly Leu Gln
1 5 10 15

His Leu Leu Ala Met Tyr Ser Gly Ser Ile Leu Val Pro Ile Met Ile
20 25 30

Ala Thr Ala Leu Gly Tyr Ser Ala Glu Gln Leu Thr Tyr Leu Ile Ser
35 40 45

Thr Asp Ile Phe Met Cys Gly Val Ala Thr Phe Leu Gln Leu Gln Leu
50 55 60

Asn Lys Tyr Phe Gly Ile Gly Leu Pro Val Val Leu Gly Val Ala Phe
65 70 75 80

Gln Ser Val Ala Pro Leu Ile Met Ile Gly Gln Ser His Gly Ser Gly
85 90 95

Ala Met Phe Gly Ala Leu Ile Ala Ser Gly Ile Tyr Val Val Leu Val
100 105 110

Ser Gly Ile Phe Ser Lys Val Ala Asn Leu Phe Pro Ser Ile Val Thr
115 120 125

Gly Ser Val Ile Thr Thr Ile Gly Leu Thr Leu Ile Pro Val Ala Ile
130 135 140

Gly Asn Met Gly Asn Asn Val Pro Glu Pro Thr Gly Gln Ser Leu Leu
145 150 155 160

Leu Ala Ala Ile Thr Val Leu Ile Ile Leu Leu Ile Asn Ile Phe Thr
165 170 175



Thr Ala Ile Ala Ala Thr Met Gly Leu Val Asp Phe Ser Pro Val Ala
195 200 205

Val Val His Leu Ser Met Ser Gln Leu His Ser Thr Leu Gly Cys Gln
210 215 220



(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 343 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

Lys Val Pro Val Tyr Leu Gly Ser Ser Phe Ala Phe Ile Thr Ala Met
1 5 10 15

Ser Leu Ala Met Lys Glu Met Gly Gly Asp Val Ser Ala Ala Gln Thr

Gly Val Ile Leu Thr Gly Leu Val Tyr Val Leu Val Ala Thr Ser Ile
35 40 45

Arg Phe Val Gly Thr Lys Trp Ile Asp Lys Leu Leu Pro Pro Ile Ile
50 55 60

Ile Gly Pro Met Ile Ile Val Ile Gly Leu Gly Leu Ala Gly Ser Ala
65 70 75 80

Val Thr Asn Ala Gly Leu Val Ala Asp Gly Asn Trp Lys Asn Ala Leu
85 90 95

Val Ala Val Val Thr Phe Leu Ile Ala Ala Phe Ile Asn Thr Lys Gly
100 105 110

Lys Gly Phe Leu Arg Ile Ile Pro Phe Leu Phe Ala Ile Ile Gly Gly
115 120 125

Tyr Leu Phe Ala Leu Thr Leu Gly Leu Val Asp Phe Thr Pro Val Leu
130 135 140

Lys Ala Asn Trp Phe Glu Ile Pro Gly Phe Tyr Leu Pro Phe Ser Thr
145 150 155 160

Gly Gly Ala Phe Lys Glu Tyr Asn Leu Tyr Phe Gly Pro Glu Ala Ile
165 170 175

Ala Ile Leu Pro Ile Ala Ile Val Thr Ile Ser Glu His Ile Gly Asp
180 185 190

His Thr Val Leu Gly Gln Ile Cys Gly Arg Gln Phe Leu Lys Glu Pro
195 200 205

Gly Leu His Arg Thr Leu Leu Gly Asp Gly Ile Ala Thr Ser Val Ser
210 215 220

Ala Phe Leu Gly Gly Pro Ala Asn Thr Thr Tyr Gly Glu Asn Thr Gly
225 230 235 240

Val Ile Gly Met Thr Arg Ile Ala Ser Val Ser Val Ile Arg Asn Ala
245 250 255

Ala Phe Ile Ala Ile Ala Leu Ser Phe Leu Gly Lys Phe Thr Ala Leu
260 265 270

Ile Ser Thr Ile Pro Asn Ala Val Leu Gly Gly Met Ser Ile Leu Leu
275 280 285

Tyr Gly Val Ile Ala Ser Asn Gly Leu Lys Val Leu Ile Lys Glu Arg
290 295 300

Val Asp Phe Ala Gln Met Arg Asn Leu Ile Ile Ala Ser Ala Met Leu
305 310 315 320

Val Leu Gly Leu Gly Gly Ala Ile Leu Lys Leu Gly Pro Val His Phe
325 330 335



(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 134 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:

Ser Leu Ile Ile Ala Leu Ala Thr Thr Leu Ile Ala Ile Ile Ile Ser
1 5 10 15

Ala Met Ala Ala Tyr Gly Ile Val Arg Phe Phe Pro Lys Leu Gly Ala
20 25 30

Ile Met Ser Arg Leu Leu Val Ile Thr Tyr Ile Phe Pro Pro Ile Leu
35 40 45

Leu Ala Ile Pro Tyr Ser Ile Ala Ile Ala Lys Val Gly Leu Thr Asn
50 55 60

Ser Leu Phe Gly Leu Met Met Val Tyr Leu Ser Phe Ser Val Pro Tyr
80

Ala Val Trp Leu Leu Val Gly Phe Phe Gln Thr Val Pro Ile Gly Ile
85 90 95

Phe Val Thr Phe Tyr
110

Val Ala Thr Ala Ile
125

Tyr Ala Leu Ile Leu
140

Ile Asn Asn Thr Gly Lys Met Thr Val Ala Val Ala Leu Arg Ser Leu
145 150 155 160

Met Ala Ala Ser Val
175

Ile Val Val Leu Pro Ser Ile Ile
180

(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 155 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:135: 

Asp Glu Leu Ala Asp Leu Met Met Val Ala Ser Lys Glu Val Glu Asp
1 5 10 15

Ala Ile Ile Arg Leu Gly Gln Lys Ala Arg Ala Ala Gly Ile His Met
20 25 30

Ile Leu Ala Thr Gln Arg Pro Ser Val Asp Val Ile Ser Gly Leu Ile
35 40 45

Lys Ala Asn Val Pro Ser Arg Val Ala Phe Ala Val Ser Ser Gly Thr
50 55 60

Asp Ser Arg Thr Ile Leu Asp Glu Asn Gly Ala Glu Lys Leu Leu Gly
65 70 75 80

Arg Gly Asp Met Leu Phe Lys Pro Ile Asp Glu Asn His Pro Val Arg
85 90 95

Leu Gln Gly Ser Phe Ile Ser Asp Asp Asp Val Glu Arg Ile Val Asn
100 105 110



Gly Glu Val Ser Glu Asn Glu Gly Glu Phe Ser Asp Gly Asp Ala Gly
130 135 140



Gly Asp Pro Leu Phe Glu Glu Ala Lys Ser Leu
145 150 155

(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 154 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

Met Thr Glu Asn Thr Pro Lys Ala Leu Val Gln Val Asn Gln Lys Pro
1 5 10 15

Leu Ile Glu Tyr Gln Ile Glu Phe Leu Lys Glu Lys Gly Ile Asn Asp

Ile Ile Ile Ile Val Gly Tyr Leu Lys Glu Gln Phe Asp Tyr Leu Lys
35 40 45

Glu Lys Tyr Gly Val Arg Leu Val Phe Asn Asp Lys Tyr Ala Asp Tyr
50 55 60

Asn Asn Phe Tyr Ser Leu Tyr Leu Val Lys Glu Glu Leu Ala Asn Ser

Tyr Val Ile Asp Ala Asp Asn Tyr Leu Phe Lys Asn Met Phe Arg Asn
85 90 95



(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 286 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Met Ser Asp Asn Ser Lys Thr Arg Val Val Val Gly Met Ser Gly Gly
1 5 10 15

Val Asp Ser Ser Val Thr Ala Leu Leu Leu Lys Glu Gln Gly Tyr Asp
20 25 30

Val Ile Gly Ile Phe Met Lys Asn Trp Asp Asp Thr Asp Glu Asn Gly
35 40 45

Val Cys Thr Ala Thr Glu Asp Tyr Lys Asp Val Val Ala Val Ala Asp
50 55 60

Gln Ile Gly Ile Pro Tyr Tyr Ser Val Asn Phe Glu Lys Glu Tyr Trp
65 70 75 80

Asp Arg Val Phe Glu Tyr Phe Leu Ala Glu Tyr Arg Ala Gly Arg Thr
85 90 95



Tyr Ala Arg Val Ala Arg Asp Glu Asp Gly Thr Val His Met Leu Arg
130 135 140



Gln Glu Gln Leu Gln Lys Thr Met Phe Pro Leu Gly His Leu Lys Lys
165 170 175



Val Asp Gly Arg Asp Met Gly Glu His Ala Gly Leu Met Tyr Tyr Thr
225 230 235 240

Ile Gly Gln Arg Gly Gly Leu Gly Ile Gly Gly Gln His Gly Gly Asp
245 250 255



(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 648 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

Met Glu Val Phe Glu Ser Leu Lys Ala Asn Leu Val Gly Lys Asn Ala
1 5 10 15

Arg Ile Val Leu Pro Glu Gly Glu Glu Pro Arg Ile Leu Gln Ala Thr
20 25 30

Lys Arg Leu Val Lys Glu Thr Glu Val Ile Pro Val Leu Leu Gly Asn
35 40 45

Pro Glu Lys Ile Lys Ile Tyr Leu Glu Ile Glu Gly Ile Met Asp Gly
50 55 60

Tyr Glu Val Ile Asp Pro Gln His Tyr Pro Gln Phe Glu Glu Met Val
65 70 75 80

Ser Ala Leu Val Glu Arg Arg Lys Gly Lys Met Thr Glu Glu Asp Val
85 90 95

Arg Lys Val Leu Val Glu Asp Val Asn Tyr Phe Gly Val Met Leu Val
100 105 110

Tyr Leu Gly Leu Val Asp Gly Met Val Ser Gly Ala Ile His Ser Thr
115 120 125

Ala Ser Thr Val Arg Pro Ala Leu Gln Ile Ile Lys Thr Arg Pro Asn
130 135 140

Val Thr Arg Thr Ser Gly Ala Phe Leu Met Val Arg Gly Thr Glu Arg
145 150 155 160

Tyr Leu Phe Gly Asp Cys Ala Ile Asn Ile Asn Pro Asp Ala Glu Ala
165 170 175

Leu Ala Glu Ile Ala Ile Asn Ser Ala Ile Thr Ala Lys Met Phe Gly
180 185 190

Ile Glu Pro Lys Ile Ala Met Leu Ser Tyr Ser Thr Lys Gly Ser Gly
195 200 205

Phe Gly Glu Ser Val Asp Lys Val Val Glu Ala Thr Lys Ile Ala His
210 215 220

Asp Leu Arg Pro Asp Leu Glu Ile Asp Gly Glu Leu Gln Phe Asp Ala
225 230 235 240

Ala Phe Val Pro Glu Thr Ala Ala Leu Lys Ala Pro Gly Ser Thr Val
245 250 255

Ala Gly Gln Ala Asn Val Phe Ile Phe Pro Gly Ile Glu Ala Gly Asn
260 265 270

Ile Gly Tyr Lys Met Ala Glu Arg Leu Gly Gly Phe Ala Ala Val Gly
275 280 285

Pro Val Leu Gln Gly Leu Asn Lys Pro Val Asn Asp Leu Ser Arg Gly
290 295 300

Cys Asn Ala Asp Asp Val Tyr Lys Leu Thr Leu Ile Thr Ala Ala Gln
305 310 315 320

Ala Val His Gln Met Glu Val Phe Glu Ser Leu Lys Ala Asn Leu Val
325 330 335

Gly Lys Asn Ala Arg Ile Val Leu Pro Glu Gly Glu Glu Pro Arg Ile
340 345 350

Leu Gln Ala Thr Lys Arg Leu Val Lys Glu Thr Glu Val Ile Pro Val
355 360 365

Leu Leu Gly Asn Pro Glu Lys Ile Lys Ile Tyr Leu Glu Ile Glu Gly
370 375 380

Ile Met Asp Gly Tyr Glu Val Ile Asp Pro Gln His Tyr Pro Gln Phe
385 390 395 400



Glu Glu Asp Val Arg Lys Val Leu Val Glu Asp Val Asn Tyr Phe Gly
420 425 430

Val Met Leu Val Tyr Leu Gly Leu Val Asp Gly Met Val Ser Gly Ala
435 440 445

Ile His Ser Thr Ala Ser Thr Val Arg Pro Ala Leu Gln Ile Ile Lys
450 455 460

Thr Arg Pro Asn Val Thr Arg Thr Ser Gly Ala Phe Leu Met Val Arg
465 470 475 480

Gly Thr Glu Arg Tyr Leu Phe Gly Asp Cys Ala Ile Asn Ile Asn Pro
485 490 495

Asp Ala Glu Ala Leu Ala Glu Ile Ala Ile Asn Ser Ala Ile Thr Ala
500 505 510

Lys Met Phe Gly Ile Glu Pro Lys Ile Ala Met Leu Ser Tyr Ser Thr
515 520 525

Lys Gly Ser Gly Phe Gly Glu Ser Val Asp Lys Val Val Glu Ala Thr
530 535 540

Lys Ile Ala His Asp Leu Arg Pro Asp Leu Glu Ile Asp Gly Glu Leu
545 550 555 560

Gln Phe Asp Ala Ala Phe Val Pro Glu Thr Ala Ala Leu Lys Ala Pro
565 570 575

Gly Ser Thr Val Ala Gly Gln Ala Asn Val Phe Ile Phe Pro Gly Ile
580 585 590

Glu Ala Gly Asn Ile Gly Tyr Lys Met Ala Glu Arg Leu Gly Gly Phe
595 600 605

Ala Ala Val Gly Pro Val Leu Gln Gly Leu Asn Lys Pro Val Asn Asp
610 615 620

Leu Ser Arg Gly Cys Asn Ala Asp Asp Val Tyr Lys Leu Thr Leu Ile
625 630 635 640

Thr Ala Ala Gln Ala Val His Gln
645

(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:

Met Arg Asn Leu Lys Ser Ile Leu Arg Arg His Ile Ser Leu Leu Gly
1 5 10 15

Phe Leu Gly Val Leu Ser Ile Trp Gln Leu Ala Gly Phe Leu Lys Leu
20 25 30

Leu Pro Lys Phe Ile Leu Pro Thr Pro Leu Glu Ile Leu Gln Pro Phe
35 40 45

Val Arg Asp Arg Glu Phe Leu Trp His His Ser Trp Ala Thr Leu Arg
50 55 60

Val Ala Leu Leu Gly Leu Ile Leu Gly Val Leu Ile Ala Cys Leu Met
65 70 75 80

Ala Val Leu Met Asp Ser Leu Thr Trp Leu Asn Asp Leu Ile Tyr Pro
85 90 95

Met Met Val Val Ile Gln Thr Ile Pro Thr Ile Ala Ile Ala Pro Ile
100 105 110

Leu Val Leu Trp Leu Gly Tyr Gly Ile Phe Ala Gln Asp Cys Leu Asp
115 120 125



(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 279 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

Pro Trp Ser Leu Val Asp Glu Tyr Glu Gln Leu Tyr Ala Thr Ile Gly
1 5 10 15

Trp His Pro Thr Glu Ala Gly Thr Tyr Thr Glu Glu Val Glu Ala Tyr

Leu Leu Asp Lys Leu Lys His Ser Lys Val Val Ala Leu Gly Glu Ile
35 40 45

Gly Leu Asp Tyr His Trp Met Thr Ala Pro Glu Val Gln Glu Gln Val
50 55 60

Phe Arg Arg Gln Ile Gln Leu Ser Lys Asp Leu Asp Leu Pro Phe Val
65 70 75 80

Val His Thr Arg Asp Ala Leu Glu Asp Thr Tyr Glu Ile Ile Lys Ser
85 90 95



Leu Ala Val Ala Thr Thr Ala Asn Ala Glu Arg Ile Phe Gly Ile Gly
195 200 205



(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 147 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

Met Lys Ile Ile Ile Gln Arg Val Lys Lys Ala Gln Val Ser Ile Glu
1 5 10 15

Gly Gln Ile Gln Gly Lys Ile Asn Gln Gly Leu Leu Leu Leu Val Gly
20 25 30

Val Gly Pro Glu Asp Gln Glu Glu Asp Leu Asp Tyr Ala Val Arg Lys
35 40 45

Leu Val Asn Met Arg Ile Phe Ser Asp Ala Glu Gly Lys Met Asn Leu
50 55 60

Ser Val Lys Asp Ile Glu Gly Glu Ile Leu Ser Ile Ser Gln Phe Thr
65 70 75 80

Leu Phe Ala Asp Thr Lys Lys Gly Asn Arg Pro Ala Phe Thr Gly Ala
85 90 95



Gln Val Glu Leu Val Asn Asn Gly Pro Val Thr Ile Ile Leu Asp Thr
130 135 140



(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 238 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

Met Ile Leu Ser Met Val Ser Thr Pro Leu Pro Ser Ser Pro Cys Lys
1 5 10 15

Tyr Arg Lys Gln Leu Tyr Leu Gln Glu Asp Leu Arg Gly Lys Asn Val
20 25 30

Glu Lys Val Lys Glu Leu Ala Thr Glu Lys Lys Val Ser Ile Ser Trp
35 40 45

Thr Ser Lys Lys Ser Leu Ser Glu Met Thr Glu Gly Ala Val His Gln
50 55 60

Gly Phe Val Leu Arg Val Ser Glu Phe Ala Tyr Ser Glu Leu Asp Tyr
65 70 75 80

Ile Leu Ala Lys Thr Arg Gln Glu Glu Asn Pro Leu Leu Leu Ile Leu
85 90 95

Asp Gly Leu Thr Asp Pro His Asn Leu Gly Ser Ile Leu Arg Thr Ala
100 105 110

Asp Ala Thr Asn Val Ser Gly Val Ile Ile Pro Lys His Arg Ala Val
115 120 125

Gly Val Thr Pro Val Val Ala Lys Thr Ala Thr Gly Ala Ile Glu His
130 135 140

Val Pro Ile Ala Arg Val Thr Asn Leu Ser Gln Thr Leu Asp Lys Leu
145 150 155 160

Lys Asp Glu Gly Phe Trp Thr Phe Gly Thr Asp Met Asn Gly Thr Pro
165 170 175

Cys His Lys Trp Asn Thr Lys Gly Lys Ile Ala Leu Ile Ile Gly Asn
180 185 190

Glu Gly Lys Gly Ile Ser Ser Asn Ile Lys Lys Gln Val Asp Glu Met
195 200 205

Ile Thr Ile Pro Met Asn Gly His Val Gln Ser Leu Asn Ala Ser Val
210 215 220

Ala Ala Ala Ile Leu Met Tyr Glu Val Phe Arg Asn Arg Leu
225 230 235

(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 345 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:143: 

Met Val Gln Gln Ala Ala Thr Val Ser Leu Met Val Leu Phe Leu Val
1 5 10 15

Pro Gln Leu Arg Asn Ala Tyr Gly Thr Ala Ala Ile Gly Ile Ile Cys
20 25 30

Gly Leu Tyr Trp Ala Val Ser Ser Asn Met Thr Val Glu Ala Thr Gln

Arg Leu Thr Gly Gly Gly Gly Phe Ala Ile Gly His Gln Gln Gln Phe
50 55 60

Ala Ile Trp Phe Val Asp Lys Val Ala Gly Arg Phe Gly Lys Lys Glu
65 70 75 80

Glu Ser Leu Asp Asn Leu Lys Leu Pro Lys Phe Leu Ser Ile Phe His
85 90 95

Asp Thr Val Val Ala Ser Ala Thr Leu Met Leu Val Phe Phe Gly Ala
100 105 110

Ile Leu Leu Ile Leu Gly Pro Asp Ile Met Ser Asn Lys Glu Val Ile
115 120 125

Thr Ser Gly Thr Leu Phe Asn Pro Ala Lys Gln Asp Phe Phe Met Tyr
130 135 140

Ile Ile Gln Thr Ala Phe Thr Phe Ser Val Tyr Leu Phe Val Leu Met
145 150 155 160

Gln Gly Val Arg Met Phe Val Ser Glu Leu Thr Asn Ala Phe Gln Gly
165 170 175

Ile Ser Asn Lys Leu Leu Pro Gly Ser Phe Pro Ala Val Asp Val Ala
180 185 190

Ala Ser Tyr Gly Phe Gly Ser Pro Asn Ala Val Leu Ser Gly Phe Thr
195 200 205

Phe Gly Leu Ile Gly Gln Leu Ile Thr Ile Val Leu Leu Ile Val Phe
210 215 220

Lys Asn Pro Ile Leu Ile Ile Thr Gly Phe Val Pro Val Phe Phe Asp
225 230 235 240

Asn Ala Ala Ile Ala Val Tyr Ala Asp Lys Arg Gly Gly Trp Lys Ala
245 250 255

Ala Val Ile Leu Ser Phe Ile Ser Gly Val Leu Gln Val Ala Leu Gly
260 265 270

Ala Leu Cys Val Ala Leu Leu Asp Leu Ala Ser Tyr Gly Gly Tyr His
275 280 285

Gly Asn Ile Asp Phe Glu Phe Pro Trp Leu Gly Phe Gly Tyr Ile Phe
290 295 300

Lys Tyr Leu Gly Ile Val Gly Tyr Val Leu Val Cys Leu Phe Leu Leu
305 310 315 320

Val Ile Pro Gln Leu Gln Phe Ala Lys Ala Lys Asp Lys Glu Lys Tyr
325 330 335

Tyr Asn Gly Glu Val Gln Glu Glu Ala
340 345

(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 287 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 



Met Val Arg Pro Ile Gly Ile Tyr Glu Lys Ala Thr Pro Thr His Phe
1 5 10 15

Thr Trp Leu Glu Arg Leu Asn Phe Ala Lys Glu Leu Gly Phe Asp Phe
20 25 30

Val Glu Met Ser Ile Asp Glu Arg Asp Glu Arg Leu Ala Arg Leu Asp
35 40 45

Trp Ser Lys Glu Glu Arg Leu Glu Val Val Lys Ala Ile Tyr Glu Thr
50 55 60

Gly Val Arg Ile Pro Ser Ile Cys Phe Ser Gly His Arg Arg Tyr Pro
65 70 75 80

Leu Gly Ser Lys Asp Pro Val Leu Glu Glu Lys Ser Leu Glu Leu Met
85 90 95



Asp Val Pro Phe Gly Gln Gly Cys Val Lys Trp Glu Glu Ala Phe Asp
225 230 235 240

Ile Leu Lys Glu Thr Asn Tyr Asn Gly Pro Phe Leu Ile Glu Met Trp

Ala Gln Ala Phe Leu Tyr Pro Leu Ile Lys Lys Ala Gly Leu Met
275 280 285

(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 221 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

Met Thr Lys Arg Ile Pro Asn Leu Gln Val Ala Leu Asp His Ser Asp
1 5 10 15

Leu Gln Gly Ala Ile Lys Ala Ala Val Ser Val Gly Gln Glu Val Asp
20 25 30

Ile Ile Glu Ala Gly Thr Val Cys Leu Leu Gln Val Gly Ser Glu Leu
35 40 45

Ala Glu Val Leu Arg Ser Leu Phe Pro Asp Lys Ile Ile Val Ala Asp
50 55 60

Thr Lys Cys Ala Asp Ala Gly Gly Thr Val Ala Lys Asn Asn Ala Val
65 70 75 80

Arg Gly Ala Asp Trp Met Thr Cys Ile Cys Cys Ala Thr Ile Pro Thr
85 90 95



Glu Ile Gln Ile Glu Leu Tyr Gly Asp Trp Thr Phe Glu Gln Ala Gln
115 120 125

Leu Trp Leu Asp Ala Gly Ile Ser Gln Ala Ile Tyr His Gln Ser Arg
130 135 140

Asp Ala Leu Leu Ala Gly Glu Thr Trp Gly Glu Lys Asp Leu Asn Lys
145 150 155 160

Val Lys Lys Leu Ile Asp Met Gly Phe Arg Val Ser Val Thr Gly Gly
165 170 175



Leu Asp Val Asp Thr Leu Lys Leu Phe Glu Gly Val Asp Val Phe Thr
180 185 190

Phe Ile Ala Gly Arg Gly Ile Thr Glu Ala Ala Asp Pro Ala Gly Ala
195 200 205

Ala Arg Ala Phe Lys Asp Glu Ile Lys Arg Ile Trp Gly
210 215 220

(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 161 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 

Met Asn Leu Lys Gln Ala Leu Ile Asp Asn Asp Ser Ile Arg Leu Gly
1 5 10 15

Leu Glu Ala Asn Asn Trp Lys Glu Ala Val Lys Val Ala Val Asp Pro
20 25 30

Leu Ile Glu Ser Gly Ala Ile Leu Pro Glu Tyr Tyr Asp Ala Ile Ile
35 40 45

Glu Ser Thr Glu Glu Tyr Gly Pro Tyr Tyr Ile Leu Met Pro Gly Met
50 55 60

Ala Met Pro His Ala Arg Pro Glu Ala Gly Val Gln Ser Asp Ala Phe
65 70 75 80

Ser Leu Ile Thr Leu Gln Asn Pro Val Val Phe Ser Asp Gly Lys Glu
85 90 95

Val Ser Val Leu Leu Ala Leu Ala Ala Thr Ser Ser Lys Ile His Thr
100 105 110

Ser Val Ala Ile Pro Gln Ile Ile Ala Leu Phe Glu Leu Glu Asp Ser
115 120 125

Ile Ala Arg Leu Gln Ala Cys Gln Thr Lys Glu Asp Val Leu Ala Met
130 135 140

Ile Glu Glu Ser Lys Asp Ser Pro Tyr Leu Glu Gly Leu Asp Leu Glu
145 150 155 160



(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 293 amino acids 

(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 

Met Ser Arg Asp Ile Ile Lys Leu Asp Gln Ile Asp Val Thr Phe His
1 5 10 15

Gln Lys Lys Arg Thr Ile Thr Ala Val Lys Asp Val Thr Ile His Ile
20 25 30

Gln Glu Gly Asp Ile Tyr Gly Ile Val Gly Tyr Ser Gly Ala Gly Lys
35 40 45

Ser Thr Leu Val Arg Val Ile Asn Leu Leu Gln Lys Pro Ser Ala Gly
50 55 60

Lys Ile Thr Ile Asp Asp Asp Val Ile Phe Asp Gly Lys Val Thr Leu
65 70 75 80

Thr Ala Glu Gln Leu Arg Arg Lys Arg Gln Asp Ile Gly Met Ile Phe
85 90 95

Gln His Phe Asn Leu Met Ser Gln Lys Thr Ala Glu Glu Asn Val Ala
100 105 110



Val Ala Lys Leu Leu Asp Leu Val Gly Leu Ala Asp Arg Ala Glu Asn
130 135 140



Ser Ala Leu Asp Pro Lys Thr Thr Lys Gln Ile Leu Ala Leu Leu Gln
180 185 190



Gly His Leu Ile Glu Glu Gly Ser Val Leu Glu Ile Phe Ser Asn Pro
225 230 235 240

Lys Gln Pro Leu Thr Gln Asp Phe Ile Ser Thr Ala Thr Gly Ile Asp
245 250 255

Glu Ala Met Val Lys Ile Glu Lys Gln Glu Ile Val Glu His Leu Ser

Glu Asn Ser Leu Leu Val Gln Leu Gln Val Arg Trp Ser Phe Asn Arg
275 280 285



(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 209 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 

Arg Asp Val Asn Phe Glu Ile Glu Lys Gly Glu Leu Val Ile Ile Leu
1 5 10 15

Gly Ala Ser Gly Ala Gly Lys Ser Thr Val Leu Asn Leu Leu Gly Gly
20 25 30

Met Asp Thr Asn Asp Glu Gly Glu Ile Trp Ile Asp Gly Val Asn Ile
35 40 45

Ala Asp Tyr Ser Ser His Gln Arg Thr Asn Tyr Arg Arg Asn Asp Val
50 55 60

Gly Phe Val Phe Gln Phe Tyr Asn Leu Val Ser Asn Leu Thr Ala Lys
65 70 75 80

Glu Asn Val Glu Leu Ser Glu Ile Val Thr Asp Ala Leu Asn Ser Asp
85 90 95

Gln Val Leu Thr Asp Val Gly Leu Ala His Arg Leu Asn Asn Phe Pro
100 105 110

Ala Gln Leu Ser Gly Gly Glu Gln Gln Arg Val Ser Ile Ala Arg Ala
115 120 125

Val Ala Lys Asn Pro Lys Ile Leu Leu Cys Asp Glu Pro Thr Gly Ala
130 135 140

Leu Asp Tyr Gln Thr Gly Lys Gln Val Leu Lys Ile Leu Gln Asp Met
145 150 155 160

Ser Arg Gln Lys Gly Ala Thr Val Ile Ile Val Thr His Asn Gly Ala
165 170 175

Leu Ala Pro Ile Ala Asp Arg Val Ile Gln Met His Asp Ala Ser Val
180 185 190

Lys Asp Val Val Leu Asn Gln His Pro Gln Asp Ile Asp
195 200 205



(2) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 213 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant 



(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 

Met Ile Glu Leu Lys Asn Ile Thr Lys Thr Ile Gly Gly Lys Val Ile
1 5 10 15

Leu Asp Asn Leu Ser Leu Arg Ile Asp Gln Gly Asp Leu Val Ala Ile
20 25 30

Val Gly Lys Ser Gly Ser Gly Lys Ser Thr Leu Leu Asn Leu Leu Gly
35 40 45

Leu Ile Asp Gly Asp Tyr Ser Gly Arg Tyr Glu Ile Phe Gly Gln Thr
50 55 60

Asn Leu Ala Val Asn Ser Ala Lys Ser Gln Thr Ile Ile Arg Glu His
65 70 75 80

Ile Ser Tyr Leu Phe Gln Asn Phe Ala Leu Ile Asp Asp Glu Thr Val
85 90 95

Glu Tyr Asn Leu Met Leu Ala Leu Lys Tyr Val Lys Leu Pro Lys Lys
100 105 110

Asp Lys Leu Lys Lys Val Glu Glu Ile Leu Glu Arg Val Gly Leu Ser
115 120 125

Ala Thr Leu His Gln Arg Val Ser Glu Leu Ser Gly Gly Glu Gln Gln
130 135 140

Arg Ile Ala Val Ala Arg Ala Ile Leu Lys Pro Ser Gln Leu Ile Leu
145 150 155 160

Ala Asp Glu Pro Thr Gly Ser Leu Asp Pro Glu Asn Arg Asp Leu Val
165 170 175

Leu Lys Phe Leu Leu Glu Met Asn Arg Glu Gly Lys Thr Val Ile Ile
180 185 190



(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 84 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:150: 

Ala Lys Pro Lys Gln Leu Ser Gly Gly Gln Lys Gln Arg Val Ala Ile
1 5 10 15

Ala Arg Ala Leu Ser Met Asn Pro Asp Ala Ile Leu Phe Asp Glu Pro
20 25 30

Thr Ser Ala Leu Asp Pro Glu Met Val Gly Glu Val Leu Lys Ile Met
35 40 45

Gln Asp Leu Ala Gln Glu Gly Leu Thr Met Ile Val Val Thr His Glu
50 55 60

Met Glu Phe Ala Arg Asp Val Ser His Arg Val Ile Phe Met Asp Lys
65 70 75 80

Gly Val Ile Pro

(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 118 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

Tyr Tyr Gly Asp Tyr His Ala Leu Arg Asn Ile Asn Leu Arg Phe Glu

Lys Gly Gln Val Val Val Leu Leu Gly Pro Ser Gly Ser Gly Lys Ser
20 25 30

Thr Leu Ile Arg Thr Ile Asn Gly Leu Glu Ala Val Asp Lys Gly Ser
35 40 45

Leu Leu Val Asn Gly His Gln Val Ala Gly Ala Ser Gln Lys Asp Leu
50 55 60

Val Pro Leu Arg Lys Glu Val Gly Met Val Phe Gln His Phe Asn Leu
65 70 75 80

Tyr Pro His Lys Thr Val Leu Glu Asn Val Thr Leu Ala Pro Ile Lys
85 90 95

Val Leu Gly Ile Asp Lys Lys Glu Ala Glu Lys Thr Ala Gln Lys Tyr
100 105 110



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 237 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

Met Thr Lys Lys Gln Leu His Leu Val Ile Val Thr Gly Met Gly Gly
1 5 10 15

Ala Gly Lys Thr Val Ala Ile Gln Ser Phe Glu Asp Leu Gly Tyr Phe
20 25 30

Thr Ile Asp Asn Met Pro Pro Ala Leu Leu Pro Lys Phe Leu Gln Leu
35 40 45

Val Glu Ile Lys Glu Asp Asn Pro Lys Leu Ala Leu Val Val Asp Met
50 55 60

Arg Ser Arg Ser Phe Phe Ser Glu Ile Gln Ala Val Leu Asp Glu Leu
65 70 75 80

Glu Asn Gln Asp Gly Leu Asp Phe Lys Ile Leu Phe Leu Asp Ala Ala
85 90 95

Asp Lys Glu Leu Val Ala Arg Tyr Lys Glu Thr Arg Arg Ser His Pro
100 105 110






Leu Leu Ala Pro Leu Lys Asn Met Ser Gln Asn Val Val Asp Thr Thr
130 135 140

Glu Leu Thr Pro Arg Glu Leu Arg Lys
145 150

Asp Gln Glu Gln Ala Gln Ser Phe Arg Ile Glu Val Met Ser Phe Gly
165 170 175

Phe Lys Tyr Gly Ile Pro Ile Asp Ala Asp Leu Val Phe Asp Val Arg
180 185 190

Phe Leu Pro Asn Pro Tyr Tyr Leu Pro Glu Leu Arg Asn Gln Thr Gly
195 200 205

Val Asp Glu Pro Val Tyr Asp Tyr Val Met Asn His Pro Glu Ser Glu
210 215 220

Asp Phe Tyr Gln His Leu Leu Ala Leu Ile Glu Pro Ile
225 230 235

(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 180 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

Met Leu Glu Asn Asp Ile Lys Lys Val Leu Val Ser His Asp Glu Ile
1 5 10 15

Thr Glu Ala Ala Lys Lys Leu Gly Ala Gln Leu Thr Lys Asp Tyr Ala
20 25 30

Gly Lys Asn Pro Ile Leu Val Gly Ile Leu Lys Gly Ser Ile Pro Phe
35 40 45

Met Ala Glu Leu Val Lys His Ile Asp Thr His Ile Glu Met Asp Phe
50 55 60

Met Met Val Ser Ser Tyr His Gly Gly Thr Ala Ser Ser Gly Val Ile
65 70 75 80

Asn Ile Lys Gln Asp Val Thr Gln Asp Ile Lys Gly Arg His Val Leu
85 90 95

Phe Val Glu Asp Ile Ile Asp Thr Gly Gln Thr Leu Lys Asn Leu Arg
100 105 110

Asp Met Phe Lys Glu Arg Glu Ala Ala Ser Val Lys Ile Ala Thr Leu

Leu Asp Lys Pro Glu Gly Arg Val Val Glu Ile Glu Ala Asp Tyr Thr
130 135 140

Cys Phe Thr Ile Pro Asn Glu Phe Val Val Gly Tyr Gly Leu Asp Tyr
145 150 155 160

Lys Glu Asn Tyr Arg Asn Leu Pro Tyr Ile Gly Val Leu Lys Glu Glu
165 170 175

Val Tyr Ser Asn
180

(2) INFORMATION FOR SEQ ID NO: 154:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 193 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:

Met Lys Ile Gly Ile Leu Ala Leu Gln Gly Ala Phe Ala Glu His Ala
1 5 10 15

Lys Val Leu Asp Gln Leu Gly Val Glu Ser Val Glu Leu Arg Asn Leu
20 25 30

Asp Asp Phe Gln Gln Asp Gln Ser Asp Leu Ser Gly Leu Ile Leu Pro
35 40 45

Gly Gly Glu Ser Thr Thr Met Gly Lys Leu Leu Arg Asp Gln Asn Met
50 55 60

Leu Leu Pro Ile Arg Glu Ala Ile Leu Ser Gly Leu Pro Val Phe Gly
65 70 75 80

Thr Cys Ala Gly Leu Ile Leu Leu Ala Lys Glu Ile Thr Ser Gln Lys
85 90 95

Glu Ser His Leu Gly Thr Met Asp Met Val Val Glu Arg Asn Ala Tyr
100 105 110

Gly Arg Gln Leu Gly Ser Phe Tyr Thr Glu Ala Glu Cys Lys Gly Val
115 120 125

Gly Lys Ile Pro Met Thr Phe Ile Arg Gly Pro Ile Ile Ser Ser Val
130 135 140

Gly Glu Gly Val Glu Ile Leu Ala Ile Val Asn Asn Gln Ile Val Ala
145 150 155 160

Ala Gln Glu Lys Asn Met Leu Val Ser Ser Phe His Pro Glu Leu Thr
165 170 175

Asp Asp Val Arg Leu His Gln Tyr Phe Ile Asn Met Cys Lys Glu Lys



(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 189 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

Glu Ser Glu Val Leu Ser Pro Ala Asp Asp Arg Phe His Val Asp Lys
1 5 10 15

Lys Glu Phe Gln Val Pro Phe Val Cys Gly Ala Lys Asp Leu Gly Glu
20 25 30

Ala Leu Arg Arg Ile Ala Glu Gly Ala Ser Met Ile Arg Thr Lys Gly
35 40 45

Glu Pro Gly Thr Gly Asp Ile Val Gln Ala Val Arg His Met Arg Met
50 55 60

Met Asn Gln Glu Ile Arg Arg Ile Gln Asn Leu Arg Glu Asp Glu Leu
65 70 75 80

Tyr Val Ala Ala Lys Asp Leu Gln Val Pro Val Glu Leu Val Gln Tyr
85 90 95

Val His Glu His Gly Lys Leu Pro Val Val Asn Phe Ala Ala Gly Gly
100 105 110

Val Ala Thr Pro Ala Asp Ala Ala Leu Met Met Gln Leu Gly Ala Glu
115 120 125

Gly Val Phe Val Gly Ser Gly Ile Phe Lys Ser Gly Asp Pro Val Lys
130 135 140

Arg Ala Ser Ala Ile Val Lys Ala Val Thr Asn Phe Arg Asn Pro Gln
145 150 155 160

Ile Leu Ala Gln Ile Ser Glu Asp Leu Gly Glu Ala Met Val Gly Ile
165 170 175

Asn Glu Asn Ile Gln Ile Leu Met Ala Glu Arg Gly Lys
180 185
(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 162 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:

Asp Lys Gly Trp Phe Val Leu Gln Thr Tyr Ser Gly Tyr Glu Asn Lys
1 5 10 15

Val Lys Glu Asn Leu Leu Gln Arg Ala Gln Thr Tyr Asn Met Leu Asp
20 25 30

Asn Ile Leu Arg Val Glu Ile Pro Thr Gln Thr Val Gln Val Glu Lys
35 40 45

Asn Gly Lys Arg Lys Glu Val Glu Glu Asn Arg Phe Pro Gly Tyr Val
50 55 60

Leu Val Glu Met Val Met Thr Asp Glu Ala Trp Phe Val Val Arg Asn
65 70 75 80

Ala Gln Ser Pro Thr Lys Phe Ile Ser Glu Gln Thr Ala Tyr Glu Ile
85 90 95



Leu Leu Lys Tyr Glu Thr Leu Asp Ser Thr Gln Ile Lys Ala Leu Tyr
130 135 140



(2) INFORMATION FOR SEQ ID NO: 157:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 208 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:

Val Asn Ser Ser Ser Val Pro Gly Asp Arg Phe Ser Val Leu Leu Glu
1 5 10 15

His Lys Gly Ile His Pro Ile Val Tyr Ile Ser Lys Met Asp Leu Leu
20 25 30

Glu Asp Arg Gly Glu Leu Asp Phe Tyr Gln Gln Thr Tyr Gly Asp Ile
35 40 45

Gly Tyr Asp Phe Val Thr Ser Lys Glu Glu Leu Leu Ser Leu Leu Thr
50 55 60

Gly Lys Val Thr Val Phe Met Gly Gln Thr Gly Val Gly Lys Ser Thr
65 70 75 80

Leu Leu Asn Lys Ile Ala Pro Asp Leu Asn Leu Glu Thr Gly Glu Ile
85 90 95

Ser Asp Ser Leu Gly Arg Gly Arg His Thr Thr Arg Ala Val Ser Phe
100 105 110



Leu Asp Tyr Glu Val Ser Arg Ala Glu Asp Leu Asn Gln Ala Phe Pro
130 135 140

Glu Ile Ala Thr Val Ser Arg Asp Cys Lys Phe Arg Thr Cys Thr His
145 150 155 160

Thr His Glu Pro Ser Cys Ala Val Lys Pro Ala Val Glu Glu Gly Val
165 170 175

Ile Ala Thr Phe Arg Phe Asp Asn Tyr Leu Gln Phe Leu Ser Glu Ile
180 185 190

Glu Asn Arg Arg Glu Thr Tyr Lys Lys Val Ser Lys Lys Ile Pro Lys
195 200 205



(2) INFORMATION FOR SEQ ID NO: 158:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 403 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:

Gln Gln Ser Val Lys Lys Lys Val Leu Pro Ala Ile Glu Arg Arg Ile
1 5 10 15

Arg Thr Glu Leu Thr Glu Lys Ala Glu Glu Gly Ala Ile Gln Leu Phe
20 25 30

Ser Asp Asn Leu Arg Asn Leu Leu Leu Val Ala Pro Leu Lys Gly Arg
35 40 45

Val Val Leu Gly Phe Asp Pro Ala Phe Arg Thr Gly Ala Lys Leu Ala
50 55 60

Val Val Asp Ala Thr Gly Lys Met Leu Thr Thr Gln Val Ile Tyr Pro
65 70 75 80

Val Lys Pro Ala Ser Ala Arg Gln Ile Glu Glu Ala Lys Lys Asp Leu
85 90 95

Ala Asp Leu Ile Gly Gln Tyr Gly Val Glu Ile Ile Ala Ile Gly Asn
100 105 110

Gly Thr Ala Ser Arg Glu Ser Glu Ala Phe Val Ala Glu Val Leu Lys
115 120 125



Pro Glu Val Ser

Val Glu Lys Arg
Leu Ala
Tyr Gln

Leu Ala Arg
Ile Ser Ile

Phe Pro Asp
Arg Leu Gln

Val Lys

Val Ser

Asp Thr Val Val

Ala Leu Leu Ser
230

Ile Val Lys Tyr
245

Ile Lys Lys Val
260

Gly Phe Leu Arg

Gly Gln
Asp Phe
Thr Ala

His Val Ala

Glu Gly
250

Asn Lys Thr
Lys Ile Thr

Glu Gln

Asp Asn

Gly Val His Pro Glu Asn Tyr Thr Ala Val Lys Leu Phe Lys Arg



Ile Leu Ile

(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 179 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:

Met Phe Arg Ala Ala Met Ala Asn Gln Thr Glu Met Gly Val Leu Ala
1 5 10 15

Lys Ser Tyr Ile Asp Lys Gly Glu Leu Val Pro Asp Glu Val Thr Asn
20 25 30

Gly Ile Val Lys Glu Arg Leu Ser Gln Asp Asp Ile Lys Glu Thr Gly
35 40 45

Phe Leu Leu Asp Gly Tyr Pro Arg Thr Ile Glu Gln Ala His Ala Leu
50 55 60

Asp Lys Thr Leu Ala Glu Leu Gly Ile Glu Leu Glu Gly Ile Ile Asn
65 70 75 80

Ile Glu Val Asn Pro Asp Ser Leu Leu Glu Arg Leu Ser Gly Arg Ile
85 90 95

Val Asp Tyr Lys Glu Glu Asp Tyr Tyr Gln Arg Glu Asp Asp Lys Pro
115 120 125



Ile Ile Ala His Tyr Arg Ala Lys Gly Leu Val His Asp Ile Glu Gly
145 150 155 160

Asn Gln Asp Ile Asn Asp Val Phe Ser Asp Ile Glu Lys Val Leu Thr



(2) INFORMATION FOR SEQ ID NO: 160:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:

Met Ile Glu Phe Glu Lys Pro Asn Ile Thr Lys Ile Asp Glu Asn Lys
1 5 10 15

Asp Tyr Gly Lys Phe Val Ile Glu Pro Leu Glu Arg Gly Tyr Gly Thr
20 25 30

Thr Leu Gly Asn Ser Leu Arg Arg Val Leu Leu Ala Ser Leu Pro Gly
35 40 45

Ala Ala Val Thr Ser Ile Asn Ile Asp Gly Val Leu His Glu Phe Asp
50 55 60

Thr Val Pro Gly Val Arg Glu Asp Val Met Gln Ile Ile Leu Asn Ile
65 70 75 80

Lys Gly Ile Ala Val Lys Ser Tyr Val Glu Asp Glu Lys Ile Ile Glu
85 90 95

Leu Asp Val Glu Gly Pro Ala Glu Val Thr Ala Gly Asp Ile Leu Thr
100 105 110

(2) INFORMATION FOR SEQ ID NO: 161:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 335 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 

Glu Tyr Leu Gly Ala Thr Val Gln Val Ile Pro His Ile Thr Asp Ala
1 5 10 15

Leu Lys Glu Lys Ile Lys Ser Ala Ala Leu Thr Thr Asp Ser Asp Val
20 25 30

Ile Ile Thr Glu Val Gly Gly Thr Val Gly Asp Ile Glu Ser Leu Pro
35 40 45

Phe Leu Glu Ala Leu Arg Gln Met Lys Ala Asp Val Gly Ala Asp Asn
50 55 60

Val Met Tyr Ile His Thr Thr Leu Pro Tyr Leu Lys Ala Ala Gly Glu
65 70 75 80

Met Lys Lys Pro Thr Gln His Ser Val Lys Leu Arg Gly Leu Gly Ile
85 90 95



Ile Lys Asn Lys Leu Ala Gln Phe Cys Asp Val Ala Pro Glu Ser Leu
115 120 125

Ile Glu Ser Leu Asp Val Glu His Leu Tyr Gln Ile Pro Leu Asn Leu
130 135 140

Gln Ala Gln Gly Met Asp Gln Ile Val Cys Asp His Leu Lys Leu Asp
145 150 155 160



Ala Pro Ala Ala Asp Met Thr Glu Trp Ser Ala Met Val Asp Lys Val

Met Asn Leu Lys Lys Gln Val Lys Ile Ser Leu Val Gly Lys Tyr Val
180 185 190



Ile Ile Val Pro Gly Gly Phe Gly Gln Arg Gly Thr Glu Gly Lys Ile
245 250 255

Gln Ala Ile Arg Tyr Ala Arg Glu Asn Asp Val Pro Met Leu Gly Val
260 265 270



Gly Thr Leu Arg Leu Gly Leu Tyr Pro Ser Lys Leu Lys Arg Leu
325 330 335

(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 301 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:

Met Ser Glu Lys Leu Val Glu Ile Lys Asp Leu Glu Ile Ser Phe Gly
1 5 10 15

Glu Gly Ser Lys Lys Phe Val Ala Val Lys Asn Ala Asn Phe Phe Ile
20 25 30

Asn Lys Gly Glu Thr Phe Ser Leu Val Gly Glu Ser Gly Ser Gly Lys
35 40 45

Thr Thr Ile Gly Arg Ala Ile Ile Gly Leu Asn Asp Thr Ser Asn Gly
50 55 60

Asp Ile Ile Phe Asp Gly Gln Lys Ile Asn Gly Lys Lys Ser Arg Glu
65 70 75 80

Gln Ala Ala Glu Leu Ile Arg Arg Ile Gln Met Ile Phe Gln Asp Pro
85 90 95

Ala Ala Ser Leu Asn Glu Arg Ala Thr Val Asp Tyr Ile Ile Ser Glu
100 105 110

Gly Leu Tyr Asn His Arg Leu Phe Lys Asp Glu Glu Glu Arg Lys Glu
115 120 125

Lys Val Gln Ser Ile Ile Arg Glu Val Gly Leu Leu Ala Glu His Leu
130 135 140

Thr Arg Tyr Pro His Glu Phe Ser Gly Gly Gln Arg Gln Arg Ile Gly
145 150 155 160

Ile Ala Arg Ala Leu Val Met Gln Pro Asp Phe Val Ile Ala Asp Glu
165 170 175

Pro Ile Ser Ala Leu Asp Val Ser Val Arg Ala Gln Val Leu Asn Leu
180 185 190

Leu Lys Lys Phe Gln Lys Glu Leu Gly Leu Thr Tyr Leu Phe Ile Ala
195 200 205

His Asp Leu Ser Val Val Arg Phe Ile Ser Asp Arg Ile Ala Val Ile
210 215 220

Tyr Lys Gly Val Ile Val Glu Val Ala Glu Thr Glu Glu Leu Phe Asn
225 230 235 240

Asn Pro Ile His Pro Tyr Thr Gln Ala Leu Leu Ser Ala Val Pro Ile
245 250 255

Pro Asp Pro Ile Leu Glu Arg Lys Lys Val Leu Lys Val Tyr Asp Pro
260 265 270

Ser Gln His Asp Tyr Glu Thr Asp Lys Pro Ser Met Val Glu Ile Arg
275 280 285

Pro Gly His Tyr Val Trp Ala Asn Gln Ala Glu Leu Ala
290 295 300

(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 151 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:

Gln Ile Gln Lys Ser Phe Lys Gly Gln Ser Pro Tyr Gly Lys Leu Tyr
1 5 10 15

Leu Val Ala Thr Pro Ile Gly Asn Leu Asp Asp Met Thr Phe Arg Ala
20 25 30

Ile Gln Thr Leu Lys Glu Val Asp Trp Ile Ala Ala Glu Asp Thr Arg
35 40 45

Asn Thr Gly Leu Leu Leu Lys His Phe Asp Ile Ser Thr Lys Gln Ile
50 55 60

Ser Phe His Glu His Asn Ala Lys Glu Lys Ile Pro Asp Leu Ile Gly
65 70 75 80

Phe Leu Lys Ala Gly Gln Ser Ile Ala Gln Val Ser Asp Ala Gly Leu
85 90 95

Pro Ser Ile Ser Asp Pro Gly His Asp Leu Val Lys Ala Ala Ile Glu
100 105 110

Glu Glu Ile Ala Val Val Thr Val Pro Gly Ala Ser Ala Gly Ile Ser
115 120 125

Ala Leu Ile Ala Ser Gly Leu Ala Pro Gln Pro His Ile Phe Tyr Gly
130 135 140



(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 258 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:

Ser Arg Lys Asp Lys Gln Glu Arg Ile Ser Lys Glu Thr Met Glu Ile
1 5 10 15

Tyr Ala Pro Leu Ala His Arg Leu Gly Ile Ser Ser Val Lys Trp Glu
20 25 30

Leu Glu Asp Leu Ser Phe Arg Tyr Leu Asn Pro Thr Glu Phe Tyr Lys
35 40 45

Ile Thr His Met Met Lys Glu Lys Arg Arg Glu Arg Glu Ala Leu Val
50 55 60

Asp Glu Val Val Thr Lys Leu Glu Glu Tyr Thr Thr Glu Arg His Leu
65 70 75 80

Lys Gly Lys Ile Tyr Gly Arg Pro Lys His Ile Tyr Ser Ile Phe Arg
85 90 95

Lys Met Gln Asp Lys Arg Lys Arg Phe Glu Glu Ile Tyr Asp Leu Ile
100 105 110



Asn Trp Ile Lys Glu Met Met Glu Leu Gln Asp Gln Ala Asp Asp Ala
210 215 220



(2) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 289 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:

Thr Lys Val Gly Gly Glu Ala Asp Tyr Leu Val Phe Pro Arg Asn Arg

Phe Glu Leu Ala Arg Val Val Lys Phe Ala Asn Gln Glu Asn Ile Pro
20 25 30

Trp Met Val Leu Gly Asn Ala Ser Asn Ile Ile Val Arg Asp Gly Gly
35 40 45

Ile Arg Gly Phe Val Ile Leu Cys Asp Lys Leu Asn Asn Val Ser Val
50 55 60

Asp Gly Tyr Thr Ile Glu Ala Glu Ala Gly Ala Asn Leu Ile Glu Thr
65 70 75 80

Thr Arg Ile Ala Leu Arg His Ser Leu Thr Gly Phe Glu Phe Ala Cys
85 90 95

Gly Ile Pro Gly Ser Val Gly Gly Ala Val Phe Met Asn Ala Gly Ala
100 105 110

Tyr Gly Gly Glu Ile Ala His Ile Leu Gln Ser Cys Lys Val Leu Thr
115 120 125



Tyr Arg His Ser Ala Ile Gln Glu Ser Gly Ala Val Val Leu Ser Val
145 150 155 160

Lys Phe Ala Leu Ala Pro Gly Thr His Gln Val Ile Lys Gln Glu Met
165 170 175

Asp Arg Leu Thr His Leu Arg Glu Leu Lys Gln Pro Leu Glu Tyr Pro
180 185 190

Ser Cys Gly Ser Val Phe Lys Arg Pro Val Gly His Phe Ala Gly Gln
195 200 205



Val Ser Glu Lys His Ala Gly Phe Met Ile Asn Val Ala Asp Gly Thr
225 230 235 240

Ala Lys Asp Tyr Glu Asp Leu Ile Gln Ser Val Ile Glu Lys Val Lys
245 250 255

Glu His Ser Gly Ile Thr Leu Glu Arg Glu Val Arg Ile Leu Gly Glu
260 265 270

Ser Leu Ser Val Ala Lys Met Tyr Ala Gly Gly Phe Thr Pro Cys Lys
275 280 285



(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 114 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:

Ala Lys Arg Arg Lys Leu Val Lys Ser Thr Thr Leu Leu Leu Ala Cys
1 5 10 15

Leu Gln Lys Pro Phe Leu Thr Thr Leu Leu Pro Thr Ile Trp Ile Cys
20 25 30

Val Lys Ser Ser Met Phe Thr Leu Leu Arg Leu Asn Thr Trp Ile Lys
35 40 45

Asp Phe His Ser Pro Ser Ser Cys Val Val Thr Phe Gln Lys Ala Phe
50 55 60

Thr Asn Gly Arg Gly Lys Ile Asn Lys Arg His Val Thr Cys Pro Ser
65 70 75 80

Phe Val Thr Met Pro Leu Thr Arg Glu Ser Ser Leu Ser Thr Thr Ser
85 90 95

Val Pro Leu Gln Met Thr Val Glu Lys Ser Ala Pro Thr Asn Val Lys
100 105 110



(2) INFORMATION FOR SEQ ID NO:167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 253 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:

Met Leu Lys Gln Glu Lys Leu Ala Lys Ile Leu Glu Ile Val Asn Ser
1 5 10 15

Lys Gly Thr Ile Thr Val Lys Gln Ile Met Asp Glu Ile Ala Val Ser
20 25 30

Asp Met Thr Ala Arg Arg Tyr Leu Gln Glu Leu Ala Asp Lys Asp Leu
35 40 45

Leu Ile Arg Val His Gly Gly Ala Glu Lys Leu Arg Thr Asn Ser Leu
50 55 60

Leu Thr Asn Glu Arg Ser Asn Ile Glu Lys Gln Ala Leu Gln Thr Ala
65 70 75 80

Glu Lys Gln Glu Ile Ala His Phe Ala Gly Ser Leu Val Glu Glu Arg
85 90 95



Glu Leu Pro Ile Asp Asn Ile Arg Val Val Thr Asn Ser Leu Pro Val
115 120 125



(2) INFORMATION FOR SEQ ID NO: 168:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 320 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

Met Glu Thr Tyr Tyr Lys Ala Ile Asn Trp Asn Ala Ile Glu Asp Val
1 5 10 15

Ile Asp Lys Ser Thr Trp Glu Lys Leu Thr Glu Gln Phe Trp Leu Asp

Thr Arg Ile Pro Leu Ser Asn Asp Leu Asp Asp Trp Arg Lys Leu Ser
35 40 45

Asn Lys Glu Lys Asp Leu Val Gly Lys Val Phe Gly Gly Leu Thr Leu
50 55 60

Leu Asp Thr Met Gln Ser Glu Thr Gly Val Gln Ala Leu Arg Ala Asp
65 70 75 80

Ile Arg Thr Pro His Glu Glu Ala Val Phe Asn Asn Ile Gln Phe Met
85 90 95



Thr Lys Ala Glu Ile Glu Glu Ile Phe Glu Trp Thr Asn Thr Asn Pro
115 120 125

Tyr Leu Gln Lys Lys Ala Glu Ile Val Asn Glu Ile Tyr Leu Asn Gly
130 135 140

Ser Pro Leu Glu Lys Lys Val Ala Ser Val Phe Leu Glu Thr Phe Leu
145 150 155 160

Phe Tyr Ser Gly Phe Phe Thr Pro Leu Tyr Tyr Leu Gly Asn Asn Lys
165 170 175

Leu Ala Asn Val Ala Glu Ile Ile Lys Leu Ile Ile Arg Asp Glu Ser
180 185 190

Val His Gly Thr Tyr Ile Gly Tyr Lys Phe Gln Leu Gly Phe Asn Glu
195 200 205

Leu Pro Glu Glu Glu Gln Glu Lys Leu Lys Glu Trp Met Tyr Asp Leu
210 215 220

Leu Tyr Thr Leu Tyr Glu Asn Glu Glu Gly Tyr Thr Glu Ser Leu Tyr
225 230 235 240

Asp Gly Val Gly Trp Thr Glu Glu Val Lys Thr Phe Leu Arg Tyr Asn
245 250 255

Ala Asn Lys Ala Leu Met Asn Met Gly Gln Asp Pro Leu Phe Pro Asp
260 265 270

Ser Ala Glu Asp Val Asn Pro Ile Val Met Asn Gly Ile Ser Thr Gly
275 280 285

Thr Ser Asn His Asp Phe Phe Ser Gln Val Gly Asn Gly Tyr Leu Leu
290 295 300

Gly Glu Val Glu Ala Met Gln Asp Asp Asp Tyr Asn Tyr Gly Leu Asp
305 310 315 320



(2) INFORMATION FOR SEQ ID NO:169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids

(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

Ile Glu Glu Gly Val Lys Val Val Thr Thr Gly Ala Gly Asn Pro Ser
1 5 10 15

Lys Tyr Met Glu Arg Phe His Glu Ala Gly Ile Ile Val Ile Pro Val
20 25 30

Val Pro Ser Val Ala Leu Ala Lys Arg Met Glu Lys Ile Gly Ala Asp
35 40 45

Ala Val Ile Ala Glu Gly Met Glu Ala Gly Gly His Ile Gly Lys Leu
50 55 60

Thr Thr Met Thr Leu Val Arg Gln Val Ala Thr Ala Val Ser Ile Pro
65 70 75 80

Val Ile Ala Ala Gly Gly Ile Ala Asp Gly Glu Gly Ala Ala Ala Gly
85 90 95



Ala Lys Glu Ser Asn Ala His Pro Asn Tyr Lys Glu Lys Ile Leu Lys
115 120 125



(2) INFORMATION FOR SEQ ID NO: 170:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 243 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:

Met Lys Leu Glu His Lys Asn Ile Phe Ile Thr Gly Ser Ser Arg Gly
1 5 10 15

Ile Gly Leu Ala Ile Ala His Lys Phe Ala Gln Ala Gly Ala Asn Ile
20 25 30

Val Leu Asn Ser Arg Gly Ala Ile Ser Glu Glu Leu Leu Ala Glu Phe
35 40 45

Ser Asn Tyr Gly Ile Lys Val Val Pro Ile Ser Gly Asp Val Ser Asp
50 55 60

Phe Ala Asp Ala Lys Arg Met Ile Asp Gln Ala Ile Ala Glu Leu Gly
65 70 75 80

Ser Val Asp Val Leu Val Asn Asn Ala Gly Ile Thr Gln Asp Thr Leu
85 90 95



Leu Thr Gly Ala Phe Asn Met Thr Gln Ser Val Leu Lys Pro Met Met
115 120 125



Ile Leu Ser Asp Lys Ile Lys Glu Ala Thr Leu Ala Gln Ile Pn
195 200 205



Leu Ser Met

(2) INFORMATION FOR SEQ ID NO: 171:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 306 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:

Met Thr Lys Thr Ala Phe Leu Phe Ala Gly Gln Gly Ala Gln Tyr Leu
1 5 10 15

Gly Met Gly Arg Asp Phe Tyr Asp Gln Tyr Pro Ile Val Lys Glu Thr
20 25 30

Ile Asp Arg Ala Ser Gln Val Leu Gly Tyr Asp Leu Arg Tyr Leu Ile
35 40 45

Asp Thr Glu Glu Asp Lys Leu Asn Gln Thr Arg Tyr Thr Gln Pro Ala
50 55 60

Ile Leu Ala Thr Ser Val Ala Ile Tyr Arg Leu Leu Gln Glu Lys Gly
65 70 75 80

Tyr Gln Pro Asp Met Val Ala Gly Leu Ser Leu Gly Glu Tyr Ser Ala
85 90 95

Leu Val Ala Ser Gly Ala Leu Asp Phe Glu Asp Ala Val Ala Leu Val
100 105 110

Ala Lys Arg Gly Ala Tyr Met Glu Glu Ala Ala Pro Ala Asp Ser Gly
115 120 125

Lys Met Val Ala Val Leu Asn Thr Pro Val Glu Val Ile Glu Glu Ala
130 135 140

Cys Gln Lys Ala Ser Glu Leu Gly Val Val Thr Pro Ala Asn Tyr Asn
145 150 155 160

Thr Pro Ala Gln Ile Val Ile Ala Gly Glu Val Val Ala Val Asp Arg
165 170 175

Ala Val Glu Leu Leu Gln Glu Ala Gly Ala Lys Arg Leu Ile Pro Leu
180 185 190

Lys Val Ser Gly Pro Phe His Thr Ser Leu Leu Glu Pro Ala Ser Gln
195 200 205

Lys Leu Ala Glu Thr Leu Ala Gln Val Ser Phe Ser Asp Phe Thr Cys
210 215 220

Pro Leu Val Gly Asn Thr Glu Ala Ala Val Met Gln Lys Glu Asp Ile

Ala Gln Leu Leu Thr Arg Gln Val Lys Glu Pro Val Arg Phe Tyr Glu
245 250 255



Ala His Leu Ala His Val Glu Asp Gln Ala Ser Leu Val Ala Leu Leu
290 295 300



(2) INFORMATION FOR SEQ ID NO:172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 318 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:172: 

Met Lys Leu Asn Arg Val Val Val Thr Gly Tyr Gly Val Thr Ser Pro
1 5 10 15

Ile Gly Asn Thr Pro Glu Glu Phe Trp Asn Ser Leu Ala Thr Gly Lys
20 25 30

Ile Gly Ile Gly Gly Ile Thr Lys Phe Asp His Ser Asp Phe Asp Val
35 40 45

His Asn Ala Ala Glu Ile Gln Asp Phe Pro Phe Asp Lys Tyr Phe Val
50 55 60

Lys Lys Asp Thr Asn Arg Phe Asp Asn Tyr Ser Leu Tyr Ala Leu Tyr
65 70 75 80

Ala Ala Gln Glu Ala Val Asn His Ala Asn Leu Asp Val Glu Ala Leu
85 90 95

Asn Arg Asp Arg Phe Gly Val Ile Val Ala Ser Gly Ile Gly Gly Ile
100 105 110

Gly Asn Val Ala Met Arg Phe Gly Ala Asn Gly Val Cys Lys Ser Ile
145 150 155 160

Asn Thr Ala Cys Ser Ser Ser Asn Asp Ala Ile Gly Asp Ala Phe Arg
165 170 175

Ser Ile Lys Phe Gly Phe Gln Asp Val Met Leu Val Gly Gly Thr Glu
180 185 190

Ala Ser Ile Thr Pro Phe Ala Ile Ala Gly Phe Gln Ala Leu Thr Ala
195 200 205

Leu Ser Thr Thr Glu Asp Pro Thr Arg Ala Ser Ile Pro Phe Asp Lys
210 215 220

Asp Arg Asn Gly Phe Val Met Gly Glu Gly Ser Gly Met Leu Val Leu
225 230 235 240

Glu Ser Leu Glu His Ala Glu Lys Arg Gly Ala Thr Ile Leu Ala Glu
245 250 255

Val Val Gly Tyr Gly Asn Thr Cys Asp Ala Tyr His Met Thr Ser Pro
260 265 270

His Pro Glu Gly Gln Gly Ala Ile Lys Ala Ile Lys Leu Ala Leu Glu
275 280 285

Glu Ala Glu Ile Ser Pro Glu Gln Val Ala Met Leu Met Leu Thr Glu
290 295 300

Arg Gln Leu Leu Pro Met Lys Lys Glu Lys Val Val Leu Ser
305 310 315

(2) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173: 

Met Gln Ala Val Glu His Phe Ile Lys Gln Phe Val Pro Glu His Tyr
1 5 10 15

Asp Leu Phe Leu Asp Leu Ser Arg Glu Thr Lys Thr Phe Ser Gly Lys
20 25 30

Val Thr Ile Thr Gly Gln Ala Gln Ser Asp Arg Ile Ser Leu His Gln
35 40 45

Lys Asp Leu Glu Ile Thr Ser Val Glu Val Ala Gly Gln Ala Arg Pro
50 55 60

Phe Thr Val Asp His Asp Asn Glu Ala Leu His Ile Glu Leu Ala Glu
65 70 75 80

Ala Gly Gln Val Glu Leu Val Leu Ala Phe Ser Gly Lys Ile Thr Asp
85 90 95

Asn Met Thr Gly Ile Tyr Pro Ser Tyr Tyr Thr Val Asp Gly Val Lys
100 105 110

Lys Glu Val Leu Ser Thr Gln Phe Glu Ser His Phe Ala Arg Glu Ala
115 120 125



Leu Arg Phe Asp Gln Ala Glu Gly Glu Leu Ala Leu Ser Asn Met Pro
145 150 155 160



Met Lys Trp Trp Asp Asp Leu Trp Leu Asn Glu Ser Phe Ala Asn Met
305 310 315 320

(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

Met Asp Phe Leu Leu Phe Tyr Asp Ser Lys Lys Lys Gly Asp Thr Met
1 5 10 15

Thr Tyr Leu Glu Lys Trp Phe Asp Phe Asn Arg Arg Gln Lys Glu Ile
20 25 30

Glu Ser Leu Leu Glu Glu Thr Ile Ala Gln Gln Ser Glu Gln Ser Leu
35 40 45

Thr Leu Lys Glu Phe Tyr Leu Leu Tyr Tyr Leu Asp Leu Ala Glu Glu
50 55 60

Lys Ser Leu Arg Gln Ile Asp Leu Pro Asp Lys Leu His Leu Ser Pro
65 70 75 80

Ser Ala Val Ser Arg Met Val Ala Arg Leu Glu Ala Lys Asn Cys Gly
85 90 95

Leu Leu Ser Arg Met Cys Cys His Gln Asp Arg Arg Ser Ser Phe Ile
100 105 110



(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 373 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

Met Leu Tyr Asp Tyr Gly Asn Ser Val Trp Leu Ala Ser Met Gly Thr
1 5 10 15

Ile Gly Gln Thr Val Leu Gly Met Tyr Gln Ile Ser Glu Leu Val Thr
20 25 30

Ser Ile Leu Val Asn Pro Phe Gly Gly Val Ile Ser Asp Arg Phe Ser
35 40 45

Arg Arg Lys Ile Leu Met Thr Ala Asp Leu Val Cys Gly Ile Leu Cys
50 55 60

Leu Ala Ile Ser Phe Ile Arg Asn Asp Ser Trp Met Ile Gly Ala Leu
65 70 75 80

Ile Val Ala Asn Ile Val Gln Ala Ile Ala Phe Ala Phe Ser Arg Thr
85 90 95

Ala Asn Lys Ala Ile Ile Thr Glu Val Val Glu Lys Asn Glu Ile Val
100 105 110

Ile Tyr Asn Ser Arg Leu Glu Leu Val Leu Gln Val Val Gly Val Ser
115 120 125

Ser Pro Val Leu Ser Phe Leu Val Leu Gln Phe Ala Ser Leu His Met
130 135 140

Thr Leu Leu Leu Asp Ser Leu Thr Phe Phe Ile Ala Phe Val Leu Val
145 150 155 160

Ala Phe Leu Pro Lys Glu Glu Ala Lys Val Gln Glu Lys Lys Ala Phe
165 170 175

Thr Gly Arg Asp Ile Phe Val Asp Ile Lys Asp Gly Leu His Tyr Ile
180 185 190

Trp His Gln Gln Glu Ile Phe Phe Leu Leu Leu Val Ala Ser Ser Val
195 200 205

Asn Phe Phe Phe Ala Ala Phe Glu Phe Leu Leu Pro Phe Ser Asn Gln
210 215 220

Leu Tyr Gly Ser Glu Gly Ala Tyr Ala Ser Ile Leu Thr Met Gly Ala
225 230 235 240

Ile Gly Ser Ile Ile Gly Ala Leu Leu Ala Ser Lys Ile Lys Ala Asn
245 250 255

Ile Tyr Asn Leu Leu Ile Leu Leu Ala Leu Thr Gly Val Gly Val Phe
260 265 270

Met Met Gly Leu Pro Leu Pro Thr Phe Leu Ser Phe Ser Gly Asn Leu
275 280 285

Val Cys Glu Leu Phe Met Thr Ile Phe Asn Ile His Phe Phe Thr Gln
290 295 300

Val Gln Thr Lys Val Glu Ser Glu Phe Leu Gly Arg Val Leu Ser Thr
305 310 315 320

Ile Phe Thr Leu Ala Ile Leu Phe Met Pro Ile Ala Lys Gly Phe Met

Val His Leu Ser Ser Phe Leu Ile Ile Gly Ser
345 350

Gly Val Ile Ile Leu Ser Cys Ile Ser Phe Ile Tyr Val Arg Thr His
355 360 365



(2) INFORMATION FOR SEQ ID NO: 176:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 427 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:176: 

Met Ser Val Ser Phe Glu Asn Lys Glu Thr Asn Arg Gly Val Leu Thr
1 5 10 15

Phe Thr Ile Ser Gln Asp Gln Ile Lys Pro Glu Leu Asp Arg Val Phe
20 25 30

Lys Ser Val Lys Lys Ser Leu Asn Val Pro Gly Phe Arg Lys Gly His
35 40 45

Leu Pro Arg Pro Ile Phe Asp Gln Lys Phe Gly Glu Glu Ala Leu Tyr
50 55 60

Gln Asp Ala Met Asn Ala Leu Leu Pro Asn Ala Tyr Glu Ala Ala Val
65 70 75 80

Lys Glu Ala Gly Leu Glu Val Val Ala Gln Pro Lys Ile Asp Val Thr
85 90 95

Ser Met Glu Lys Gly Gln Asp Trp Val Ile Thr Ala Glu Val Val Thr
100 105 110

Lys Pro Glu Val Lys Leu Gly Asp Tyr Lys Asn Leu Glu Val Ser Val
115 120 125

Asp Val Glu Lys Glu Val Thr Asp Ala Asp Val Glu Glu Arg Ile Glu
130 135 140

Arg Glu Arg Asn Asn Leu Ala Glu Leu Val Ile Lys Glu Ala Ala Ala
145 150 155 160

Glu Asn Gly Asp Thr Val Val Ile Asp Phe Val Gly Ser Ile Asp Gly

Val Glu Phe Asp Gly Gly Lys Gly Glu Asn Phe Ser Leu Gly Leu Gly
180 185 190



Ala Gly Glu Thr Val Asp Val Ile Val Thr Phe Pro Glu Asp Tyr Gln
210 215 220

Ala Glu Asp Leu Ala Gly Lys Glu Ala Lys Phe Val Thr Thr Ile His
225 230 235 240

Glu Val Lys Ala Lys Glu Val Pro Ala Leu Asp Asp Glu Leu Ala Lys
245 250 255



Arg Lys Glu Leu Ala Ala Ala Lys Glu Glu Thr Tyr Lys Asp Ala Val
275 280 285

Glu Gly Ala Ala Ile Asp Thr Ala Val Glu Asn Ala Glu Ile Val Glu
290 295 300

Leu Pro Glu Glu Met Ile His Glu Glu Val His Arg Ser Val Asn Glu
305 310 315 320

Phe Leu Gly Asn Leu Gln Arg Gln Gly Ile Asn Pro Asp Met Tyr Phe
325 330 335



Glu Ala Glu Ser Arg Thr Lys Thr Asn Leu Val Ile Glu Ala Val Ala
355 360 365

Lys Ala Glu Gly Phe Asp Ala Ser Glu Glu Glu Ile Gln Lys Glu Val
370 375 380



Leu Leu Ser Ala Asp Met Leu Lys His Asp Ile Thr Ile Lys Lys Ala
405 410 415

Val Glu Leu Ile Thr Ser Thr Ala Thr Val Lys
420 425

(2) INFORMATION FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 207 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:

Gly Gly Asp Lys Asp Phe Leu Thr Ser Ile Cys Leu Thr Asn Asp Pro
1 5 10 15

Phe Leu Gly Phe Arg Ala Leu Arg Ile Ser Ile Ser Glu Thr Gly Asp
20 25 30

Ala Met Phe Arg Thr Gln Ile Arg Ala Leu Leu Arg Ala Ser Val His
35 40 45

Gly Gln Leu Arg Ile Met Phe Pro Met Val Ala Leu Leu Lys Glu Phe
50 55 60

Arg Ala Ala Lys Ala Val Phe Asp Glu Glu Lys Ala Asn Leu Leu Ala
65 70 75 80

Glu Gly Val Ala Val Ala Asp Asn Ile Gln Val Gly Ile Met Ile Glu
85 90 95

Ile Pro Ala Ala Ala Met Leu Ala Asp Gln Phe Ala Lys Glu Val Asp
100 105 110

Phe Phe Ser Ile Gly Thr Asn Asp Leu Ile Gln Tyr Thr Met Ala Ala
115 120 125

Asp Arg Met Asn Glu Gln Val Ser Tyr Leu Tyr Gln Pro Tyr Asn Pro
130 135 140

Ser Ile Leu Arg Leu Ile Asn Asn Val Ile Lys Ala Ala His Ala Glu
145 150 155 160

Gly Lys Trp Ala Gly Met Cys Gly Glu Met Ala Gly Asp Gln Gln Ala
165 170 175

Val Pro Leu Leu Val Gly Met Gly Leu Asp Glu Phe Ser Met Ser Ala
180 185 190

Thr Cys Thr Ser Tyr Thr Gln Leu Asp Glu Glu Thr Arg His Ser
195 200 205

(2) INFORMATION FOR SEQ ID NO: 178: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 283 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

Met Gln Met Ala Tyr Arg Cys Asn Leu Arg Asn Asn Gly Lys Arg Arg
1 5 10 15

Ile Gly Ile Arg Glu Met Thr Glu Met Leu Lys Gly Ile Ala Ala Ser
20 25 30

Asp Gly Val Ala Val Ala Lys Ala Tyr Leu Leu Val Gln Pro Asp Leu
35 40 45

Ser Phe Glu Thr Ile Thr Val Glu Asp Thr Asn Ala Glu Glu Ala Arg
50 55 60

Leu Asp Ala Ala Leu Gln Ala Ser Gln Asp Glu Leu Ser Val Ile Arg
65 70 75 80

Glu Lys Ala Val Gly Thr Leu Gly Glu Glu Ala Ala Gln Val Phe Asp
85 90 95

Ala His Leu Met Val Leu Ala Asp Pro Glu Met Ile Ser Gln Ile Lys
100 105 110 



Val Thr Asp Met Phe Ile Thr Ile Phe Glu Gly Met Glu Asp Asn Pro
130 135 140

Tyr Met Gln Glu Arg Ala Arg Asp Ile Arg Asp Val Thr Lys Arg Val
145 150 155 160

Leu Ala Asn Leu Leu Gly Lys Lys Leu Pro Asn Pro Ala Ser Ile Asn
165 170 175



Gln Leu Asp Lys Asn Phe Val Lys Ala Phe Val Thr Asn Ile Gly Gly
195 200 205 



Val Leu Gly Thr Asn Asn Ile Thr Glu Ile Val Lys Asp Gly Asp Ile
225 230 235 240

Leu Ala Val Asn Gly Ile Thr Gly Glu Val Ile Ile Asn Pro Thr Asp
245 250 255

Glu Gln Ala Ala Glu Phe Lys Ala Ala Gly Glu Ala Tyr Ala Thr Lys
260 265 270

Ala Glu Trp Ala Leu Leu Lys Asp Ala Gln Gln
275 280 

(2) INFORMATION FOR SEQ ID NO: 179:

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179: 

Met Ile Gly Arg Leu Ala Pro Tyr Asp Lys Gly Gln Ile Ile Tyr Asp
1 5 10 15

Gly Thr Ser Leu Lys Asp Ile Lys Pro Ser Val Phe Phe Arg Asp Tyr
20 25 30

Leu Gly Tyr Leu Phe Gln Asp Phe Gly Leu Ile Glu Ser Gln Thr Val
35 40 45

Lys Glu Asn Leu Asn Leu Gly Leu Val Gly Lys Lys Leu Lys Glu Lys
50 55 60

Glu Lys Ile Ser Leu Met Lys Gln Ala Leu Asn Arg Val Asn Leu Ser
65 70 75 80

Tyr Leu Asp Leu Lys Gln Pro Ile Phe Glu Leu Ser Gly Gly Glu Ala
85 90 95



Leu Ala Asp Glu Pro Thr Ala Ser Leu Asp Pro Lys Asn Ser Glu Glu 
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 210 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:180:

Met Lys Ala His Val Ser Tyr Leu Ser Met Gly Glu Lys Arg Phe Val
1 5 10 15

Tyr Asn Asn Gly Glu Asn Pro Val Ser Thr Gln Tyr Leu Thr Asp Pro
20 25 30

Ile Leu Val Val Phe Thr Pro Thr Ser Thr Gly Asp Ser Phe Ile Ser
35 40 45

Leu Ser Ser Trp Ser Ile Asn Ala Gly Lys Gln Leu Phe Ile Lys Gly
50 55 60

Tyr Glu Ser Gly Leu Glu Leu Leu Lys Lys Ala Gly Ile Tyr Glu Gln
65 70 75 80

Val Ser Tyr Leu Lys Glu Gly Arg Ser Val Tyr Leu Thr Arg Tyr Asn
85 90 95

Glu Val Gln Thr Glu Thr Ala Thr Leu Ile Leu Gly Ala Ile Val Gly
100 105 110

Ile Ala Ser Ser Leu Leu Leu Phe Tyr Ser Val Asn Leu Leu Tyr Phe
115 120 125

Glu Gln Phe Arg Arg Asp Ile Leu Ile Lys Arg Ile Ser Gly Leu Arg
130 135 140

Phe Phe Glu Thr His Ala Gln Tyr Met Val Ser Gln Phe Ala Ser Phe
145 150 155 160

Val Phe Gly Ala Ser Leu Phe Ile Leu Ser Ser Arg Asp Leu Val Ile
165 170 175

Gly Leu Leu Thr Leu Leu Val Phe Leu Ala Ser Ala Val Leu Thr Leu
180 185 190

Tyr Arg Gln Ala Gln Lys Glu Ser Arg Val Ser Met Thr Ile Met Lys
195 200 205



(2) INFORMATION FOR SEQ ID NO:181:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 227 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:181:

Glu Phe Gln Glu Ala Ser Gln Glu Ser Arg Glu Arg Ser Asp Pro Leu
1 5 10 15

Asn Ser Tyr Leu Leu Leu Ser Gly Ser Leu Thr Lys Glu Lys Leu Ala 
20 25 30 

Asp Lys Leu Gly Asp Leu Gly Tyr Lys Ala Ser Ala Asp Arg Lys Ile
35 40 45

Pro Pro Tyr Phe Leu Ala Phe Arg Ile Leu Leu Asn Pro Leu Ile Leu
50 55 60

Ile Ser Leu Ala Ile Phe Gly Leu Ser Phe Phe Ala Leu Val Ile Ile
65 70 75 80

Thr Arg Ile Lys Glu Met Arg Ala Ala Gly Ile Lys Leu Phe Ser Gly
85 90 95

Gln Thr Leu Leu Ser Ile Met Gly His Ser Leu Ser Thr Asp Ile Lys
100 105 110 

Trp Leu Leu Leu Ser Ala Leu Leu Ser Phe Leu Gly Gly Gly Val Val 
115 120 125 

Leu Phe Ser Gln Gly Leu Phe Tyr Pro Ile Leu Leu Ala Thr Tyr Gly
130 135 140

Phe Gly Ile Ser Phe Tyr Leu Leu Phe Leu Leu Ala Ile Ser Ile Leu
145 150 155 160 

Leu Met Leu Leu Tyr Leu Met Ser Leu Asn Lys Ala Leu Val Pro Val 
165 170 175 

Ile Arg Gly Arg Phe Pro Leu Leu Met Thr Leu Phe Gln Pro Val Phe
180 185 190

Ser Val Gly Tyr Ala Lys Thr Gly Leu Thr Ser Tyr Gln Arg Leu Lys
195 200 205 



(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 
Met Ser Lys Asp Lys Lys Asn Glu Asp Lys Glu Thr Leu Glu Glu Leu
1 5 10 15

Lys Glu Leu Ser Glu Trp Gln Lys Arg Asn Gln Glu Tyr Leu Lys Lys
20 25 30 

Lys Ala Glu Glu Glu Val Ala Leu Ala Glu Glu Lys Glu Lys Glu Arg 
35 40 45 

Gln Ala Arg Met Gly Glu Glu Ser Glu Lys Ser Glu Asp Lys Gln Asp
50 55 60

Gln Glu Ser Glu Thr Asp Gln Glu Asp Ser Glu Ser Ala Lys Glu Glu
65 70 75 80 

Ser Glu Glu Lys Val Ala Ser Ser Glu Ala Asp Lys Glu Lys Glu Glu 
85 90 95 



Ala Thr Lys Glu Lys Pro Ala Lys Ala Lys Ile Pro Gly Ile His Ile
115 120 125 



Tyr Glu Lys Gln Ile Lys Ser Asn Tyr Trp Val Glu Ser Ala Gln Leu
195 200 205 



Ile Val Ala Tyr Tyr Ile Ser Gly Glu Asn His Tyr Pro Ile Leu Ser
225 230 235 240 



Val Ser Glu Leu Ala Gln Ile Ser Pro Glu Leu Lys Ala Ala Ile Gln
275 280 285

Lys Val Glu Leu Ala Pro Ser Lys Val Thr Ser Asp Leu Ile Arg Leu
290 295 300

Lys Lys Leu Pro Tyr Tyr Ser Lys Ile Lys Pro Gln Leu Ser Glu Pro
325 330 335



Lys Leu Ile Met Glu Ala Glu Glu Lys Ala Lys Gln Glu Ala Lys Glu
355 360 365

Ala Glu Lys Lys Gln Glu Glu Glu Gln Lys Lys Gln Glu Glu Glu Ser
370 375 380



(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 165 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:183: 

Met Glu Arg Val Val Asp Ile Leu Lys Ala Glu Phe Asp Arg Ser Phe
1 5 10 15

Lys Leu Ile Asn Ser Lys Thr Tyr Pro Val Ser Gly Gly Glu Leu Asn
20 25 30

Pro Ala Asn Val Asp Ser Glu Ile Glu Ala Phe Ala Gln Leu Gly Val
35 40 45 

Ser Arg Gly Leu Asp Ser Lys Glu Ala His Tyr Leu Ala Asn Leu Tyr 
50 55 60 

Gly Ser Asn Ala Pro Lys Val Phe Ala Leu Ala His Ser Leu Glu Gln
65 70 75 80 

Ala Pro Gly Leu Ser Leu Ala Asp Thr Leu Ser Leu His Tyr Ala Met 
85 90 95 

Arg Asn Glu Leu Ala Leu Ser Pro Val Asp Phe Leu Leu Arg Arg Thr 
100 105 110 

Asn His Met Leu Phe Met Arg Asp Ser Leu Asp Ser Ile Val Glu Pro
115 120 125 

Val Leu Asp Glu Met Gly Arg Phe Tyr Asp Trp Thr Glu Glu Glu Lys 
130 135 140 

Ala Thr Tyr Arg Ala Asp Val Glu Ala Ala Leu Ala Asn Asn Asp Leu 
145 150 155 160 
Ala Glu Leu Lys Asn
165

(2) INFORMATION FOR SEQ ID NO:184:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:184: 

Met Asn Glu Leu Phe Gly Glu Phe Leu Gly Thr Leu Ile Leu Ile Leu
1 5 10 15 

Leu Gly Asn Gly Val Val Ala Gly Val Val Leu Pro Lys Thr Lys Ser 
20 25 30 

Asn Ser Ser Gly Trp Ile Val Ile Thr Met Gly Trp Gly Ile Ala Val
35 40 45 

Ala Val Ala Val Phe Val Ser Gly Lys Leu Ser Pro Ala His Leu Asn 
50 55 60 

Pro Ala Val Thr Ile Gly Val Ala Leu Lys Gly Gly Leu Pro Trp Ala
65 70 75 80

Ser Val Leu Pro Tyr Ile Leu Ala Gln Phe Ala Gly Ala Met Leu Gly
85 90 95

Gln Ile Leu Val Trp Leu Gln Phe Lys Pro His Tyr Glu Ala Glu Glu
100 105 110

Asn Ala Gly Asn Ile Leu Ala Thr Phe Ser Thr Gly Pro Ala Ile Lys
115 120 125

Asp Thr Val Ser Asn Leu Ile Ser Glu Ile Leu Gly Thr Phe Val Leu
130 135 140

Val Leu Thr Ile Phe Ala Leu Gly Leu Tyr Asp Phe Gln Ala Gly Ile
145 150 155 160

Gly Thr Phe Ala Val Gly Thr Leu Ile Val Gly Ile Gly Leu Ser Leu
165 170 175 

Gly Gly Thr Thr Gly Tyr Ala Leu Asn Pro Ala Arg Asp Leu Gly Pro 
180 185 190 

Arg Ile Met His Ser Ile Leu Pro Ile Pro Asn Lys Gly Asp Gly Asp
195 200 205

Trp Ser Tyr Ala Trp Ile Pro Val Val Gly Pro Val Ile Gly Ala Ala






(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 

Thr Thr Asp Asn Val Ile Asp Leu Phe Glu His Ile Phe Lys Met Phe
1 5 10 15

Asn Glu Asn Ile Val Met Ala Gly Lys Val Asn Leu Leu Asn Phe Ala
20 25 30

Asn Leu Ala Ala Tyr Gln Phe Phe Asp Gln Pro Gln Lys Val Ala Leu
35 40 45

Glu Ile Arg Glu Gly Leu Arg Glu Asp Gln Met Gln Asn Val Arg Val
50 55 60

Ala Asp Gly Gln Glu Ser Cys Leu Ala Asp Leu Ala Val Ile Ser Ser
65 70 75 80

Lys Phe Leu Ile Pro Tyr Arg Gly Val Gly Ile Leu Ala Ile Ile Gly
85 90 95

Pro Val Asn Leu Asp Tyr Gln Gln Leu Ile Asn Gln Ile Asn Val Val
100 105 110



(2) INFORMATION FOR SEQ ID NO: 186: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 153 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 

Met Ile Ala Lys Glu Phe Glu Thr Phe Leu Leu Gly Gln Glu Glu Thr
1 5 10 15

Phe Leu Thr Pro Ala Lys Asn Leu Ala Val Leu Ile Asp Thr His Asn
20 25 30

Ala Asp His Ala Thr Leu Leu Leu Ser Gln Met Thr Tyr Thr Arg Val
35 40 45

Pro Val Val Thr Asp Glu Lys Gln Phe Val Gly Thr Ile Gly Leu Arg
50 55 60

Asp Ile Met Ala Tyr Gln Met Glu His Asp Leu Ser Gln Glu Ile Met
65 70 75 80

Ala Asp Thr Asp Ile Val His Met Thr Lys Thr Asp Val Ala Val Val
85 90 95

Ser Pro Asp Phe Thr Ile Thr Glu Val Leu His Lys Leu Val Asp Glu
100 105 110

Ser Phe Leu Pro Val Val Asp Ala Glu Gly Ile Phe Gln Gly Ile Ile
115 120 125

Thr Arg Lys Ser Ile Leu Lys Ala Val Asn Ala Leu Leu His Asp Phe
130 135 140

Ser Lys Glu Tyr Glu Ile Arg Cys Gln
145 150 

(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 

Met Ala Lys Gln Thr Ile Ile Val Met Ser Asp Ser His Gly Asp Ser
1 5 10 15

Leu Ile Val Glu Glu Val Arg Asp Arg Tyr Val Gly Lys Val Asp Ala
20 25 30 

Val Phe His Asn Gly Asp Ser Glu Leu Arg Pro Asp Ser Pro Leu Trp 
Glu Gly Ile Arg Val Val Lys Gly Asn Met Asp Phe Tyr Ala Gly Tyr
50 55 60

Pro Glu Arg Leu Val Thr Glu Leu Gly Ser Thr Lys Ile Ile Gln Thr
65 70 75 80

His Gly His Leu Phe Asp Ile Asn Phe Asn Phe Gln Lys Leu Asp Tyr
85 90 95

Trp Ala Gln Glu Glu Glu Ala Ala Ile Cys Leu Tyr Gly His Leu His
100 105 110



Ser Ile Ser Gln Pro Arg Gly Thr Ile Arg Glu Cys Leu Tyr Ala Arg
130 135 140

Val Glu Ile Asp Asp Ser Tyr Phe Lys Val Asp Phe Leu Thr Arg Asp
145 150 155 160

His Glu Val Tyr Pro Gly Leu Ser Lys Glu Phe Ser Arg 
165 170 

(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 189 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:188:

Met Ser Thr Leu Ala Lys Ile Glu Ala Leu Leu Phe Val Ala Gly Glu
1 5 10 15

Asp Gly Ile Arg Val Arg Gln Leu Ala Glu Leu Leu Ser Leu Pro Pro
20 25 30

Thr Gly Ile Gln Gln Ser Leu Gly Lys Leu Ala Gln Lys Tyr Glu Lys
35 40 45

Asp Pro Asp Ser Ser Leu Ala Leu Ile Glu Thr Ser Gly Ala Tyr Arg
50 55 60

Leu Val Thr Lys Pro Gln Phe Ala Glu Ile Leu Lys Glu Tyr Ser Lys
65 70 75 80

Ala Pro Ile Asn Gln Ser Leu Ser Arg Ala Ala Leu Glu Thr Leu Ser
85 90 95

Ile Ile Ala Tyr Lys Gln Pro Ile Thr Arg Ile Glu Ile Asp Ala Ile
100 105 110

Arg Gly Val Asn Ser Ser Gly Ala Leu Ala Lys Leu Gln Ala Phe Asp
115 120 125



Glu Glu Leu Pro Val Ile Asp Glu Leu Glu Ile Gln Ala Gln Glu Ser
165 170 175



(2) INFORMATION FOR SEQ ID NO: 189: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 

Met Arg Asp Arg Ile Ser Ala Phe Leu Glu Glu Lys Gln Gly Leu Ser
1 5 10 15

Val Asn Ser Lys Gln Ser Tyr Lys Tyr Asp Leu Glu Gln Phe Leu Asp
20 25 30

Met Val Gly Glu Arg Ile Ser Glu Thr Ser Leu Lys Ile Tyr Gln Ala
35 40 45

Gln Leu Ala Asn Leu Lys Ile Ser Ala Gln Lys Arg Lys Ile Ser Ala
50 55 60

Cys Asn Gln Phe Leu Tyr Phe Leu Tyr Gln Lys Gly Glu Val Asp Ser
65 70 75 80

Phe Tyr Arg Leu Glu Leu Ala Lys Gln Ala Glu Lys Lys Thr Glu Lys
85 90 95

Pro Glu Ile Leu Tyr Leu Asp Ser Phe Trp Gln Glu Ser Asp His Pro
100 105 110

Glu Gly Arg Leu Leu Ala Leu Leu Ile Leu Glu Met Gly Leu Leu Pro
115 120 125

Ser Glu Ile Leu Ala Ile Lys Val Ala Asp Ile Asn Leu Asp Phe Gln
130 135 140

Val Leu Arg Ile Ser Lys Ala Ser Gln Gln Arg Ile Val Thr Ile Pro

Thr Ala Leu Leu Ser Glu Le
165

Phe Glu Arg Gly Glu Lys Pro Tyr Ser Arg Gln Trp Ala Phe Arg Gln
180 185 190

Leu Glu Ser Phe Val Arg Arg Arg Phe Pro Ser Leu Ser Ala Gln Val



(2) INFORMATION FOR SEQ ID NO: 190:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 239 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:

Met Arg Ile Asn Lys Tyr Ile Ala His Ala Gly Val Ala Ser Arg Arg
1 5 10 15

Lys Ala Glu Glu Leu Ile Lys Gln Gly Leu Val Thr Val Asn Gly Gln
20 25 30

Val Val Arg Glu Leu Ala Thr Thr Ile Lys Ser Gly Asp Lys Val Glu
35 40 45

Val Glu Gly Gln Pro Ile Tyr Asn Glu Glu Lys Val Tyr Tyr Leu Leu
50 55 60

Asn Lys Pro Arg Gly Val Ile Ser Ser Val Thr Asp Asp Lys Gly Arg
65 70 75 80

Lys Thr Val Val Asp Leu Leu Pro Asn Val Lys Glu Arg Ile Tyr Pro
85 90 95

Val Gly Arg Leu Asp Trp Asp Thr Ser Gly Val Leu Ile Leu Thr Asn
100 105 110

Asp Gly Asp Phe Thr Asp Glu Met Ile His Pro Arg Asn Glu Ile Asp
115 120 125

Lys Val Tyr Val Ala Arg Val Lys Gly Val Ala Asn Lys Asp Asn Leu
130 135 140

Arg Pro Leu Thr Arg Gly Leu Glu Ile Asp Gly Lys Lys Thr Lys Pro
145 150 155 160

Ala Val Tyr Glu Ile Leu Lys Val Asp Pro Val Lys Asn Arg Ser Val
165 170 175

Val Gln Leu Thr Ile His Glu Gly Arg Asn His Gln Val Lys Lys Met
180 185 190

Phe Glu Ala Val Gly Leu Gln Val Asp Lys Leu Ser Arg Thr Arg Phe
195 200 205

Gly His Leu Asp Leu Thr Leu Arg Pro Gly Glu Ser Arg Arg Leu Asn
210 215 220

Lys Lys Glu Ile Ser Gln Leu His Thr Met Ala Val Thr Lys Lys
225 230 235

(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 243 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:

Met Asp Ile Lys Leu Lys Arg Phe Leu Lys Asp Pro Gly Leu Ala Leu
1 5 10 15

Cys Ile Trp Phe Leu Ser Thr Lys Met Asp Ile Tyr Asp Val Pro Ile
20 25 30

Thr Glu Val Ile Glu Gln Tyr Leu Ala Tyr Val Ser Thr Leu Gln Ala
35 40 45

Met Arg Leu Glu Val Thr Gly Glu Tyr Met Val Met Ala Ser Gln Leu
50 55 60

Met Leu Ile Lys Ser Arg Lys Leu Leu Pro Lys Val Ala Glu Val Thr
65 70 75 80

Asp Leu Gly Asp Asp Leu Glu Gln Asp Leu Leu Ser Gln Ile Glu Glu
85 90 95

Tyr Arg Lys Phe Lys Leu Leu Gly Glu His Leu Glu Ala Lys His Gln
100 105 110

Glu Arg Ala Gln Tyr Tyr Ser Lys Ala Pro Thr Glu Leu Ile Tyr Glu
115 120 125

Asp Ala Glu Leu Val His Asp Lys Thr Thr Ile Asp Leu Phe Leu Ala
130 135 140

Phe Ser Asn Ile Leu Ala Lys Lys Lys Glu Glu Phe Ala Gln Asn His
145 150 155 160

Thr Thr Ile Leu Arg Asp Glu Tyr Lys Ile Glu Asp Met Met Ile Ile
165 170 175

Val Lys Glu Ser Leu Ile Gly Arg Asp Gln Leu Arg Leu Gln Asp Leu
180 185 190

Phe Lys Glu Ala Gln Asn Val Gln Glu Val Ile Thr Leu Phe Leu Ala
195 200 205

Thr Leu Glu Leu Ile Lys Thr Gln Glu Leu Ile Leu Val Gln Glu Glu
210 215 220

Ser Phe Gly Asp Ile Tyr Leu Met Glu Lys Lys Glu Glu Ser Gln Val
225 230 235 240

Pro Gln Ser

(2) INFORMATION FOR SEQ ID NO:192:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 336 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:

Met Ala Gly Lys Arg Asp Ser Cys Gly Ala Cys Arg Ile Met Thr Asn
1 5 10 15

Lys Ile Tyr Glu Tyr Lys Asp Asp Gln Asn Trp Tyr Val Gly Ser Tyr
20 25 30

Ser Ile Phe Gly Gly Val Asn Ser Leu Ser Asp Tyr Lys Ala Asp Phe
35 40 45

Pro Leu Phe Glu Phe Ser Lys Ile Phe Gly Asp Glu Glu Tyr Gly Phe
50 55 60

Pro Leu Ser Val Thr Val Leu Arg Tyr Gly Ser Thr Tyr Arg Leu Phe
65 70 75 80

Ser Phe Val Val Asp Met Leu Asn Gln Glu Met Gly Arg Asn Leu Glu
85 90 95

Val Ile Gln Arg His Gly Ala Leu Leu Leu Val Glu Asn Gly Gln Leu
100 105 110

Leu Tyr Val Glu Leu Pro Lys Glu Gly Val Asn Val His Asp Phe Phe
115 120 125

Glu Thr Ser Lys Val Arg Glu Thr Leu Leu Ile Ala Thr Arg Asn Glu

Gly Lys Thr Lys Glu Phe Arg Ala Ile Phe Asp Lys Leu Gly Tyr Asp
145 150 155 160

Val Glu Asn Leu Asn Asp Tyr Pro Asp Leu Pro Glu Val Ala Glu Thr
165 170 175



Gln Leu Thr Gly Lys Met Val Leu Ala Asp Asp Ser Gly Leu Lys Val
195 200 205

Asp Val Leu Gly Gly Leu Pro Gly Val Trp Ser Ala Arg Phe Ala Gly
210 215 220

Val Gly Ala Thr Asp Arg Glu Asn Asn Ala Lys Leu Leu His Glu Leu
225 230 235 240

Ala Met Val Phe Glu Leu Lys Asp Arg Ser Ala Gln Phe His Thr Thr
245 250 255



Trp Ser Gly Tyr Ile Asn Phe Glu Pro Lys Gly Glu Asn Gly Phe Gly
275 280 285

Tyr Asp Pro Leu Phe Leu Val Gly Glu Thr Gly Glu Ser Ser Ala Glu
290 295 300



(2) INFORMATION FOR SEQ ID NO: 193:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 219 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:

Glu Asn Asn Tyr Glu Pro Gln Tyr Ile Asn Ile Arg Gly Lys Gly Pro
1 5 10 15

Leu Ile Asn Asp Leu Lys Lys Glu Ala Lys Lys Ala Asn Lys Val Phe

Leu Ala Ser Asp Pro Asp Arg Glu Gly Glu Ala Ile Ser Trp His Leu
35 40 45

Ala His Ile Leu Asn Leu Asp Glu Asn Asp Ala Asn Arg Val Val Phe
50 55 60

Asn Glu Ile Thr Lys Asp Ala Val Lys Asn Ala Phe Lys Glu Pro Arg
65 70 75 80

Lys Ile Asp Met Asp Leu Val Asp Ala Gln Gln Ala Arg Arg Ile Leu
85 90 95



Lys Lys Gly Leu Ser Ala Gly Arg Val Gln Ser Ile Ala Leu Lys Leu
115 120 125



Trp Thr Val Asp Ala Val Phe Lys Lys Gly Thr Lys Gln Phe His Ala
145 150 155 160

Ser Phe Tyr Gly Val Asp Gly Lys Lys Met Lys Leu Thr Ser Asn Asn
165 170 175

Glu Val Lys Glu Val Leu Ser Arg Leu Thr Ser Lys Asp Phe Ser Val
180 185 190

Asp Gln Val Asp Lys Lys Glu Arg Lys Ala Asn Ala Pro Leu Pro Tyr
195 200 205



(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 236 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194: 

Met Ser Ile His Ile Ala Ala Gln Gln Gly Glu Ile Ala Asp Lys Ile
1 5 10 15

Leu Leu Pro Gly Asp Pro Leu Arg Ala Lys Phe Ile Ala Glu Asn Phe
20 25 30

Leu Gly Asp Ala Val Cys Phe Asn Glu Val Arg Asn Met Phe Gly Tyr
35 40 45

Thr Gly Thr Tyr Lys Gly His Arg Val Ser Val Met Gly Thr Gly Met
50 55 60

Gly Met Pro Ser Ile Ser Ile Tyr Ala Arg Glu Leu Ile Val Asp Tyr
65 70 75 80

Gly Val Lys Lys Leu Ile Arg Val Gly Thr Ala Gly Ser Leu Asn Glu
85 90 95

Glu Val His Val Arg Glu Leu Val Leu Ala Gln Ala Ala Ala Thr Asn
100 105 110

Ser Asn Ile Val Arg Asn Asp Trp Pro Gln Tyr Asp Phe Pro Gln Ile
115 120 125

Ala Ser Phe Asp Leu Leu Asp Lys Ala Tyr His Ile Ala Lys Glu Leu
130 135 140

Gly Met Thr Thr His Val Gly Asn Val Leu Ser Ser Asp Val Phe Tyr
145 150 155 160

Ser Asn Tyr Phe Glu Lys Asn Ile Glu Leu Gly Lys Trp Gly Val Lys
165 170 175

Ala Val Glu Met Glu Ala Ala Ala Leu Tyr Tyr Leu Ala Ala Gln Tyr
180 185 190

His Val Asp Ala Leu Ala Ile Met Thr Ile Ser Asp Ser Leu Val Asn
195 200 205

Pro Asp Glu Asp Thr Thr Ala Glu Glu Arg Gln Asn Thr Phe Thr Asp
210 215 220

Met Met Lys Val Gly Leu Glu Thr Leu Ile Ala Glu
225 230 235

(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 477 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195: 

Ile Ile Phe Pro Ile Leu Thr Gly Thr Tyr Val Ala Arg Val Leu Asp
1 5 10 15

Arg Thr Asp Tyr Gly Tyr Phe Asn Ser Val Asp Thr Ile Leu Ser Phe

Phe Leu Pro Phe Ala Thr Tyr Gly Val Tyr Asn Tyr Gly Leu Arg Ala
35 40 45

Ile Ser Asn Val Lys Asp Asn Lys Lys Asp Leu Asn Arg Thr Phe Ser
50 55 60

Ser Leu Phe Tyr Leu Cys Ile Ala Cys Thr Ile Leu Thr Thr Ala Val
65 70 75 80

Tyr Ile Leu Ala Tyr Pro Leu Phe Phe Thr Asp Asn Pro Ile Val Lys
85 90 95



Ile Glu Trp Val Asn Glu Ala Leu Glu Asn Tyr Ser Phe Leu Phe Tyr
115 120 125



Val Lys Asn Glu His Asp Ile Val Val Tyr Thr Leu Val Met Ser Leu
145 150 155 160

Ser Thr Leu Ile Asn Tyr Leu Ile Ser Tyr Phe Trp Ile Lys Arg Asp
165 170 175

Ile Lys Leu Val Lys Ile His Leu Ser Asp Phe Lys Pro Leu Phe Leu
180 185 190

Pro Leu Thr Ala Met Leu Val Phe Ala Asn Ala Asn Met Leu Phe Thr
195 200 205



Val Thr Gly Ala Ile Gly Val Ser Val Pro Arg Leu Ser Tyr Tyr Leu
245 250 255

Gly Lys Gly Asp Lys Glu Ala Tyr Val Ser Leu Val Asn Arg Gly Ser
260 265 270

Arg Ile Phe Asn Phe Phe Ile Ile Pro Leu Ser Phe Gly Leu Met Val
275 280 285



Gly Gly Ile Leu Thr Ser Leu Phe Ala Phe Arg Thr Ile Ile Leu Ala
305 310 315 320



Lys Arg Ile Thr Val Tyr Thr Val Phe Ala Gly Leu Leu Asn Leu Gly
340 345 350

Leu Asn Ser Leu Leu Phe Phe Asn His Ile Val Ala Pro Glu Tyr Tyr

Phe Leu Ile Asn Phe Val Tyr Pro Val Asp Met Val Ile Asn Leu Pro
420 425 430



Ile Ser Leu Leu Val Phe Thr Lys Asp Ser Ile Phe Tyr Glu Phe Leu
450 455 460



(2) INFORMATION FOR SEQ ID NO: 196: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 148 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:196: 

Phe Pro Ile Asp Arg Phe Asp Asp Pro Lys Val Ile Asp Thr Cys Tyr
1 5 10 15

Lys Leu Glu Ser Phe Lys Leu Leu Ser Phe Ser Lys His Lys Asn Ile
20 25 30

Val Tyr Lys Asp Ser Leu Leu Lys Asp Trp Ile Arg Thr Ala Phe Trp
35 40 45

Leu Leu Leu Arg Pro Val Ser Pro Arg Tyr Phe Ala Asn Lys Ile Glu
50 55 60

Lys Glu Ile Gln Lys Tyr Ser Arg Glu Asn Gly Gln Tyr Met Ala Phe
65 70 75 80

Ile Pro Ser Lys Phe Lys Glu Lys Glu Val Phe Pro Ser Gly Thr Phe
85 90 95

Asp Lys Thr Ile Asp Leu Pro Phe Glu Asn Leu Ser Leu Pro Ala Pro
100 105 110

Glu Lys Phe Asp Thr Ile Leu Thr Gln Phe Tyr Gly Asp Tyr Met Thr
115 120 125

Leu Pro Pro Glu Glu Lys Arg Phe Tyr Ser His Glu Phe His Ala Tyr
130 135 140



(2) INFORMATION FOR SEQ ID NO:197:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 280 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:197:

Met Asn Phe Thr Leu Ile Asn Trp Arg Ile Arg Met Gln Tyr Leu Glu
1 5 10 15

Lys Lys Glu Ile Lys Glu Ile Gln Leu Ala Leu Leu Asp Tyr Ile Asp
20 25 30

Glu Thr Cys Lys Lys His Asp Ile Pro Tyr Phe Leu Ser Tyr Gly Thr
35 40 45

Met Leu Gly Ala Ile Arg His Lys Gly Met Ile Pro Trp Asp Asp Asp
50 55 60

Ile Asp Ile Ser Leu Tyr Arg Glu Asp Tyr Glu Arg Leu Leu Lys Ile
65 70 75 80

Ile Glu Glu Glu Asn His Pro Arg Tyr Lys Val Leu Ser Tyr Asp Thr
85 90 95

Ser Ser Trp Tyr Phe His Asn Phe Ala Ser Ile Leu Asp Thr Ser Thr
100 105 110

Val Ile Glu Asp His Val Lys Tyr Lys Arg His Asp Thr Ser Leu Phe
115 120 125

Ile Asp Val Phe Pro Ile Asp Arg Phe Thr Asp Leu Ser Ile Val Asp
130 135 140

Lys Ser Tyr Lys Tyr Val Ala Leu Arg Gln Leu Ala Tyr Ile Lys Lys
145 150 155 160



Cys Ser Trp Tyr Ala Leu Arg Phe Val Asn Pro Arg Tyr Phe Tyr Lys
180 185 190

Lys Ile Asp Gln Leu Val Lys Asn Ala Val Thr Asn Thr Pro Gln Tyr
195 200 205

Glu Gly Gly Val Gly Ile Gly Lys Glu Gly Met Lys Glu Ile Phe Pro
210 215 220

Val Asp Thr Phe Lys Glu Leu Ile Leu Thr Glu Phe Glu Gly Arg Met
225 230 235 240

Leu Pro Val Pro Lys Lys Tyr Asp Gln Phe Leu Thr Gln Met Tyr Gly
245 250 255

Asp Tyr Met Thr Pro Pro Ser Lys Glu Met Gln Glu Trp Tyr Ser His
260 265 270

Ser Ile Lys Ala Tyr Arg Lys Asn
275 280

(2) INFORMATION FOR SEQ ID NO:198:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 223 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:

Lys Gly Phe Ile Pro Trp Asp Asp Asp Leu Asp Phe Phe Met Pro Arg
1 5 10 15

Lys Asp Tyr Glu Lys Leu Ala Glu Leu Trp Pro Arg Tyr Ala Asp Glu
20 25 30

Arg Tyr Phe Leu Ser Lys Ser His Lys Asp Phe Val Asp Arg Asn Leu
35 40 45

Phe Ile Thr Ile Arg Asp Lys Lys Thr Thr Cys Ile Lys Pro Tyr Gln
50 55 60

Gln Asp Leu Asp Leu Pro His Gly Leu Ala Leu Asp Val Leu Pro Leu
65 70 75 80

Asp Tyr Tyr Pro Lys Asn Pro Ala Glu Arg Lys Lys Gln Val Arg Trp
85 90 95

Ala Leu Ile Tyr Ser Leu Phe Cys Ala Gln Thr Ile Pro Glu Lys His
100 105 110

Gly Asp Leu Met Lys Trp Gly Ser Arg Ile Leu Leu Gly Leu Thr Pro
115 120 125

Lys Ser Leu Arg Tyr Arg Ile Trp Lys Lys Ala Glu Lys Glu Met Thr

(2) INFORMATION FOR SEQ ID NO:199:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 835 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199: 

Gly Phe Asp Asp Tyr His Pro Ser Cys Gly Arg Ile Leu Ser Val Val
1 5 10 15

Thr Ser Gly Gly Glu Asp Ile Ala Asp Ala Ile Ile Ile Leu Ala Val
20 25 30

Val Ile Ile Asn Ala Ala Phe Gly Val Tyr Gln Glu Gly Lys Ala Glu
35 40 45

Glu Ala Ile Glu Ala Leu Lys Ser Met Ser Ser Pro Val Ala Arg Val
50 55 60

Leu Arg Asp Gly His Met Ala Glu Ile Asp Ser Lys Glu Leu Val Pro
65 70 75 80

Gly Asp Ile Val Ala Leu Glu Ala Gly Asp Val Val Pro Ala Asp Leu



Asp Ala Gly Ile Gly Asp Arg Val Asn Met Ala Phe Gln Asn Ser Asn
130 135 140

Val Thr Tyr Gly Arg Gly Met Gly Val Val Val Asn Thr Gly Met Tyr
145 150 155 160

Thr Glu Val Gly His Ile Ala Gly Met Leu Gln Asp Ala Asp Glu Thr
165 170 175



Tyr Ala Ile Leu Val Ile Ala Leu Val Thr Phe Val Val Gly Val Phe
195 200 205

Ile Gln Gly Lys Asn Pro Leu Gly Glu Leu Leu Thr Ser Val Ala Leu
210 215 220

Ala Val Ala Ala Ile Pro Glu Gly Leu Pro Ala Ile Val Thr Ile Val
225 230 235 240

Leu Ser Leu Gly Thr Gln Val Leu Ala Lys Arg His Ser Ile Val Arg
245 250 255

Lys Leu Pro Ala Val Glu Thr Leu Gly Ser Thr Glu Ile Ile Ala Ser
260 265 270

Asp Lys Thr Gly Thr Leu Thr Met Asn Lys Met Thr Val Glu Lys Val
275 280 285

Phe Tyr Asp Ala Val Leu His Asp Ser Ala Asp Asp Ile Glu Leu Gly
290 295 300

Leu Glu Met Pro Leu Leu Arg Ser Val Val Leu Ala Asn Asp Thr Lys
305 310 315 320

Ile Asp Val Glu Gly Asn Leu Ile Gly Asp Pro Thr Glu Thr Ala Phe
325 330 335

Ile Gln Tyr Ala Leu Asp Lys Gly Tyr Asp Val Lys Gly Phe Leu Glu
340 345 350

Lys Tyr Pro Arg Val Ala Glu Leu Pro Phe Asp Ser Asp Arg Lys Leu
355 360 365

Met Ser Thr Val His Pro Leu Pro Asp Ser Arg Phe Leu Val Ala Val
370 375 380

Lys Gly Ala Pro Asp Gln Leu Leu Lys Arg Cys Leu Leu Arg Asp Lys
385 390 395 400



Thr Asn Asn Ser Glu Met Ala His Gln Ala Leu Arg Val Leu Ala Gly
420 425 430

Ala Tyr Lys Ile Ile Asp Ser Ile Pro Glu Asn Leu Thr Ser Glu Glu
435 440 445

Leu Glu Asn Asp Leu Ile Phe Thr Gly Leu Ile Gly Met Ile Asp Pro
450 455 460

Glu Arg Pro Glu Ala Ala Glu Ala Val Arg Val Ala Lys Glu Ala Gly

Met Thr Gly Asp Gly Val Asn Asp Ala Pro Ala Leu Lys Thr Ala Asp
565 570 575



Ser Asp Met Ile Leu Ala Asp Asp Asn Phe Ala Thr Ile Ile Val Ala
595 600 605



Thr Leu Phe Gly Trp Asp Val Leu Gln Pro Val His Leu Leu Trp Ile
645 650 655



Ile Val Val Glu Pro Leu Glu Gly Ile Phe His Val Thr Lys Leu Asp
785 790 795 800

Leu Ser Gln Trp Gly Ile Val Met Ala Gly Ser Phe Ser Met Ile Ile
805 810 815

Ile Val Glu Ile Val Lys Phe Ile Gln Arg Lys Leu Gly Phe Asp Lys
820 825 830



(2) INFORMATION FOR SEQ ID NO: 200: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 525 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200: 

Gly Phe Ile Leu Phe Phe Val Leu Leu Gly Ala Val Phe Glu Glu Lys
1 5 10 15

Met Arg Lys Asn Thr Ser Gln Ala Val Glu Lys Leu Leu Asp Leu Gln
20 25 30

Ala Lys Thr Ala Glu Val Leu Ser Asp Asp Ser Tyr Val Gln Val Pro
35 40 45

Leu Glu Gln Val Lys Val Gly Asp Leu Ile Arg Val Arg Pro Gly Glu
50 55 60

Lys Ile Ala Val Asp Gly Val Val Val Glu Gly Val Ser Ser Ile Asp
65 70 75 80

Glu Ser Met Val Thr Gly Glu Ser Leu Pro Val Asp Lys Thr Val Gly
85 90 95



Arg Ala Glu Lys Val Gly Ser Glu Thr Val Leu Ala Gln Ile Val Asp
115 120 125

Phe Val Lys Lys Ala Gln Thr Ser Arg Ala Pro Ile Gln Asp Leu Thr
130 135 140

Asp Lys Ile Ser Gly Ile Phe Val Pro Val Val Val Ile Leu Gly Ile
145 150 155 160



Leu Gly Ala Ser Phe Val Ser Ser Leu Leu Tyr Gly Val Ala Val Leu



Met Val Gly Thr Gly Arg Ser Ala Lys Met Gly Val Leu Leu Lys Asn
210 215 220

Gly Thr Val Leu Gln Glu Ile Gln Lys Val Gln Thr Leu Val Phe Asp
225 230 235 240



Met Leu Thr Gly Asp Asn Ala Gly Val Ala Arg Ala Ile Ala Asp Gln
385 390 395 400 

(2) INFORMATION FOR SEQ ID NO:201:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 362 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:201:

Asn Asp Ile Ile Glu Phe Met Asp Lys Asn Lys Ile Met Gly Leu Thr
1 5 10 15

Gln Arg Glu Val Lys Glu Arg Gln Ala Glu Gly Leu Val Asn Asp Phe
20 25 30

Thr Ala Ser Ala Ser Thr Ser Thr Trp Gln Ile Val Lys Arg Asn Val
35 40 45

Phe Thr Leu Phe Asn Ala Leu Asn Phe Ala Ile Ala Leu Ala Leu Ala
50 55 60

Phe Val Gln Ala Trp Ser Asn Leu Val Phe Phe Ala Val Ile Cys Phe
65 70 75 80

Asn Ala Phe Ser Gly Ile Val Thr Glu Leu Arg Ala Lys His Met Val
85 90 95

Asp Lys Leu Asn Leu Met Thr Lys Glu Lys Val Lys Thr Ile Arg Asp
100 105 110

Gly Gln Glu Val Ala Leu Asn Pro Glu Glu Leu Val Leu Gly Asp Val
115 120 125

Ile Arg Leu Ser Ala Gly Glu Gln Ile Pro Ser Asp Ala Leu Val Leu
130 135 140

Glu Gly Phe Ala Glu Val Asn Glu Ala Met Leu Thr Gly Glu Ser Asp
145 150 155 160



Ala Ser Gly Ser Val Leu Ser Gln Val His His Val Gly Ala Asp Asn
180 185 190

Tyr Ala Ala Lys Leu Met Leu Glu Ala Lys Thr Val Lys Pro Ile Asn
195 200 205

Ser Arg Ile Met Lys Ser Leu Asp Lys Leu Ala Gly Phe Thr Gly Lys

Ile Ile Ile Pro Phe Gly Leu Ala Leu Leu Leu Glu Ala Leu Leu Leu
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO:202:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 243 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:202:

Ala Ser Asn Ile Met Phe Met Leu Asp Leu Gly Asn His Leu Asp Gln
1 5 10 15

Trp Ser Leu Lys Lys Thr Ala Thr Asp Leu Glu Gln Ser Leu Leu Ala
20 25 30

Lys Glu Ser Asp Val Phe Leu Val Gln Gly Asp Thr Val Val Ser Ile
35 40 45

Lys Ser Ser Asp Val Gln Ile Gly Asp Val Leu Ile Leu Ser Gln Gly
50 55 60

Asn Glu Ile Leu Phe Asp Gly Gln Val Val Ser Gly Leu Gly Met Val

Asn Glu Ser Ser Leu Thr Gly Glu Ser Phe Pro Val Glu Lys Arg Glu



Ile Arg Val Thr Asp Asn Gln Met Asn Ser Arg Ile Leu Gln Leu Ile
115 120 125

Glu Leu Met Lys Lys Ser Glu Glu Asn Lys Lys Thr Lys Gln Arg Tyr
130 135 140

Phe Ile Lys Met Ala Asp Lys Val Val Lys Tyr Asn Phe Leu Gly Ser
145 150 155 160

Gly Leu Thr Tyr Leu Leu Thr Gly Ser Phe Ser Lys Ala Ile Ser Phe
165 170 175

Leu Leu Val Asp Phe Ser Cys Ala Leu Lys Ile Ser Thr Pro Val Ala
180 185 190

Tyr Leu Thr Val Ile Lys Val Gly Leu Asn Arg Glu Met Val Ile Lys
195 200 205



Asp Lys Thr Gly Pro Ile Thr Thr Ser Tyr Pro Ile Val Glu Lys Val
225 230 235 240

Tyr Pro Leu 

(2) INFORMATION FOR SEQ ID NO:203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 307 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:203: 

Lys Gln Ile Glu Val Val Asp Lys Asp Asn Lys Ser Glu Thr Ala Glu
1 5 10 15

Ala Ala Ser Val Thr Thr Asn Leu Val Thr Gln Ser Lys Val Ser Ala
20 25 30

Val Val Gly Pro Ala Thr Ser Gly Ala Thr Ala Ala Ala Val Ala Asn
35 40 45

Ala Thr Lys Ala Gly Val Pro Leu Ile Ser Pro Ser Ala Thr Gln Asp
50 55 60

Gly Leu Thr Lys Gly Gln Asp Tyr Leu Phe Ile Gly Thr Phe Gln Asp
65 70 75 80

Ser Phe Gln Gly Lys Ile Ile Ser Asn Tyr Val Ser Glu Lys Leu Asn
85 90 95

Ala Lys Lys Val
100

Gly Ile Ala Lys Ser Phe Arg Glu Ser Tyr Lys Gly Glu Ile Val Ala
115 120 125



Gln Ala Thr Ala Glu Lys Ala Ser Asn Ile Tyr Phe Ile Ser Gly Phe
195 200 205



Val Thr Gly Gln Thr Ser Phe Asp Ala Asp His Asn Thr Val Lys Thr
275 280 285

Ala Tyr Met Met Thr Met Asn Asn Gly Lys Val Glu Ala Ala Glu Val 



(2) INFORMATION FOR SEQ ID NO:204:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 289 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:204:

Met Leu Gln Gln Leu Val Asn Gly Leu Ile Leu Gly Ser Val Tyr Ala
1 5 10 15

Leu Leu Ala Leu Gly Tyr Thr Met Val Tyr Gly Ile Ile Lys Leu Ile
20 25 30

Asn Phe Ala His Gly Asp Ile Tyr Met Met Gly Ala Phe Ile Gly Tyr
35 40 45

Phe Leu Ile Asn Ser Phe Gln Met Asn Phe Phe Val Ala Leu Ile Val
50 55 60

Ala Met Leu Ala Thr Ala Ile Leu Gly Val Val Ile Glu Phe Leu Ala
65 70 75 80

Tyr Arg Pro Leu Arg His Ser Thr Arg Ile Ala Val Leu Ile Thr Ala
85 90 95



Ala Asn Thr Arg Ala Phe Pro Gln Ala Ile Gln Thr Val Arg Tyr Asp
115 120 125

Leu Gly Pro Ile Ser Leu Thr Asn Val Gln Leu Met Ile Leu Gly Ile
130 135 140



Met Gly Lys Ala Met Arg Ala Val Ser Val Asp Ser Asp Ala Ala Gln
165 170 175



Gly Ser Ala Leu Ala Gly Ala Ala Gly Val Leu Ile Ala Leu Tyr Tyr
195 200 205

Asn Ser Leu Glu Pro Leu Met Gly Val Thr Pro Gly Leu Lys Ser Phe
210 215 220



Gly Gly Phe Val Ile Gly Leu Leu Glu Thr Phe Ala Thr Ala Phe Gly
245 250 255



Leu Ile Val Arg Pro Ala Gly Ile Leu Gly Lys Asn Val Lys Glu Lys
275 280 285 

(2) INFORMATION FOR SEQ ID NO:205:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 106 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:205:

Ser Gln Asp Gln Thr Trp Tyr Ala Leu Ala Tyr Asp Gly Ala Glu Val
1 5 10 15

Ile Gly Phe Leu Thr Val Gln Glu Thr Leu Phe Glu Ala Glu Val Leu
20 25 30

Gln Ile Ala Val Lys Gly Ala Tyr Gln Gly Gln Gly Ile Ala Ser Ala
35 40 45

Leu Phe Ala Gln Leu Pro Thr Asp Lys Glu Ile Phe Leu Glu Val Arg
50 55 60

Gln Ser Asn Gln Arg Ala Gln Ala Phe Tyr Lys Lys Glu Lys Met Ala
65 70 75 80

Val Ile Ala Glu Arg Lys Ala Tyr Tyr His Asp Pro Val Glu Asp Ala
85 90 95 



(2) INFORMATION FOR SEQ ID NO:206:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:206:

Lys Thr Leu Lys Gly His Gly Gln Phe Leu His Ala Lys Thr Leu Gly
1 5 10 15

Phe Thr His Pro Arg Thr Gly Lys Thr Leu Glu Phe Lys Ala Asp Ile

Pro Glu Ile Phe Lys Glu Thr Leu Glu Arg Leu Arg Lys
35 40 45 

(2) INFORMATION FOR SEQ ID NO:207:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 163 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:207:

Arg Glu Met Val Val His Pro Ser Ala Gly His Thr Ser Gly Thr Leu
1 5 10 15

Val Asn Ala Leu Met Tyr His Ile Lys Asp Leu Ser Gly Ile Asn Gly
20 25 30

Val Leu Arg Pro Gly Ile Val His Arg Ile Asp Lys Asp Thr Ser Gly
35 40 45

Leu Leu Met Ile Ala Lys Asn Asp Asp Ala His Leu Val Leu Ala Gln
50 55 60

Glu Leu Lys Asp Lys Lys Ser Leu Arg Lys Tyr Trp Ala Ile Val His
65 70 75 80

Gly Asn Leu Pro Asn Asp Arg Gly Val Ile Glu Ala Pro Ile Gly Arg
85 90 95

Ser Glu Lys Asp Arg Lys Lys Gln Ala Val Thr Ala Lys Gly Lys Pro
100 105 110

Ala Val Thr Arg Phe His Val Leu Glu Arg Phe Gly Asp Tyr Ser Leu
115 120 125

Val Glu Leu Gln Leu Glu Thr Gly Arg Thr His Gln Ile Arg Val His
130 135 140

Met Ala Tyr Ile Gly His Pro Val Ala Gly Asp Glu Val Tyr Gly Pro
145 150 155 160



(2) INFORMATION FOR SEQ ID NO:208:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 224 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:208:

Leu Gly Thr Arg Gly Ser Ser Arg Val Asp Asn Ile Asn Leu Gln Val
1 5 10 15

Asp Glu Arg Asp Arg Ile Ala Leu Val Gly Lys Asn Gly Ala Gly Lys
20 25 30

Ser Thr Leu Leu Lys Ile Leu Val Gly Glu Glu Glu Pro Thr Ser Gly
35 40 45

Glu Ile Asn Lys Lys Lys Asp Ile Ser Leu Ser Tyr Leu Ala Gln Asp
50 55 60

Ser Arg Phe Glu Ser Glu Asn Thr Ile Tyr Asp Glu Met Leu His Val
65 70 75 80

Phe Asn Asp Leu Arg Arg Thr Glu Arg Gln Leu Arg Gln Met Glu Leu
85 90 95

Glu Met Gly Glu Lys Ser Gly Glu Asp Leu Asp Lys Leu Met Ser Asp
100 105 110

Tyr Asp Arg Leu Ser Glu Asn Phe Arg Gln Ala Gly Gly Phe Thr Tyr
115 120 125

Glu Ala Asp Ile Arg Ala Ile Leu Asn Gly Phe Lys Phe Asp Glu Ser
130 135 140

Met Trp Gln Met Lys Ile Ala Glu Leu Ser Gly Gly Gln Asn Thr Arg
145 150 155 160

Leu Ala Leu Ala Lys Met Leu Leu Glu Lys Pro Asn Leu Leu Val Leu
165 170 175

Asp Glu Pro Thr Asn His Leu Asp Ile Glu Thr Ile Ala Trp Leu Glu
180 185 190

Asn Tyr Leu Val Asn Tyr Ser Gly Ala Leu Ile Ile Val Ser His Asp
195 200 205

Arg Tyr Phe Leu Asp Lys Val Ala Thr Ile Thr Leu Asp Leu Thr Ser
210 215 220 

(2) INFORMATION FOR SEQ ID NO: 209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:209: 

Ser Thr Thr His His Leu Leu Val Lys Lys Val Asn Gly Leu Leu Val 
1 5 10 15 

Arg Trp Lys Asn Ala Cys Arg Gln Asn Cys Lys Gln Thr Phe Xaa Phe
20 25 30

Val Leu Thr Gln Leu Ile His Ala Asp Lys Trp Thr Val Ser Gly Arg
35 40 45

Gly Glu Leu His Leu Ser Ile Leu Ile Glu Thr Met Arg Arg Glu Gly
50 55 60

Tyr Glu Leu Gln Val Ser Arg Pro Glu Val Ile Val Lys Glu Ile Asp
65 70 75 80

Gly Val Lys Cys Glu Pro Phe Glu Arg Val Gln Ile Asp Thr Pro Glu
85 90 95

Glu Tyr Gln Gly Ser Val Ile Gln Ser Leu Ser Glu Arg Lys Gly Glu
100 105 110

Met Leu Asp Met Ile Ser Thr Gly Asn Gly Gln Thr Arg Leu Val Phe
115 120 125

Leu Val Pro Ala Arg Gly Leu Xaa Trp Ile Leu Asn Val Leu Val Asn
130 135 140

Asp Ser Trp Leu Arg Tyr His Glu Pro Tyr Leu Arg Pro Ile Leu Ala
145 150 155 160

Ile Asp Ser Arg Gly Asn Trp Trp Thr Ser Pro Trp Cys Pro Cys Phe
165 170 175 

Tyr Arg Cys Trp Gly Tyr Asn Leu Leu Asn Leu Leu Leu Ser Thr Leu 
180 185 190 

(2) INFORMATION FOR SEQ ID NO: 210: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



WO 98/26072 



PCT/US97/22578 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210: 

Met Phe Gly Phe Phe Lys Lys Asp Lys Ala Val Glu Val Glu Val Pro 
1 5 10 15

Thr Gln Val Pro Ala His Ile Gly Ile Ile Met Asp Gly Asn Gly Arg
20 25 30

Trp Ala Lys Lys Arg Met Gln Pro Arg Val Phe Gly His Lys Ala Gly
35 40 45

Met Glu Ala Leu Gln Thr Val Thr Lys Ala Ala Asn Lys Leu Gly Val
50 55 60

Lys Val Ile Thr Val Tyr Ala Phe Ser Thr Glu Asn Trp Thr Arg Pro
65 70 75 80

Asp Gln Glu Val Lys Phe Xaa Met Asn Leu Pro Val Glu Phe Tyr Asp
85 90 95

Asn Tyr Val Pro Glu Leu His Ala Asn Asn Val Lys Ile Gln Met Ile
100 105 110

Gly Glu Thr Asp Arg Leu Pro Lys Gln Thr Phe Glu Ala Leu Thr Lys
115 120 125

Ala Glu Glu Leu Thr Lys Asn Asn Thr Gly Leu Ile Leu Asn Phe Ala
130 135 140

Leu Asn Tyr Gly Gly Arg Ala Glu Ile Thr Gln Ala Leu Lys Leu Ile
145 150 155 160

Ser Gln Asp Val Leu Asp Ala Lys Ile Asn Pro Gly Asp Ile Thr Glu
165 170 175

Glu Leu Ile Gly Asn Tyr Leu Phe Thr Gln His Leu Pro Lys Asp Leu
180 185 190

Arg Asp Pro Asp Leu Ile Ile Arg Thr Ser Gly Glu Leu Arg Leu Ser
195 200 205

Asn Phe Leu Pro Trp Gln Gly Ala Tyr Ser Glu Leu Tyr Phe Thr Asp
210 215 220

Thr Leu Trp Pro Asp Phe Asp Glu Ala Ala Leu Gln Glu Ala Ile Leu
225 230 235 240 

Ala Tyr Asn Arg Arg His Arg Arg Phe Gly Gly Val 
245 250 

(2) INFORMATION FOR SEQ ID NO:211: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 67 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:211: 

Val Glu Gln Lys Leu Arg Gly Arg Asn Glu Asn Glu Ile Gln Ser Glu
1 5 10 15

Asp Ile Gly Ser Leu Val Met Glu Glu Leu Ala Glu Leu Asp Glu Ile
20 25 30

Thr Tyr Val Arg Phe Ala Ser Val Tyr Arg Ser Phe Lys Asp Val Ser
35 40 45

Glu Leu Glu Ser Leu Leu Gln Gln Ile Thr Gln Ser Ser Lys Lys Lys
50 55 60 

Lys Glu Arg 
65 

(2) INFORMATION FOR SEQ ID NO:212: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 75 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212: 

Val Asp Ser Arg Gln Ala Glu Glu Gly Asn Thr Ile Arg Arg Arg Arg
1 5 10 15

Glu Cys Asp Glu Cys Gln His Arg Phe Thr Thr Tyr Glu Arg Val Glu
20 25 30

Glu Arg Thr Leu Val Val Val Lys Lys Asp Gly Thr Arg Glu Gln Phe
35 40 45

Ser Arg Asp Lys Ile Phe Asn Gly Ile Ile Arg Ser Ala Gln Lys Arg
50 55 60

Pro Val Ser Ser Asp Glu Ile Asn Met Val Ile
65 70 75 

(2) INFORMATION FOR SEQ ID NO:213: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 amino acids 






(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:213: 

Phe Ala Gln Val Pro Lys Val Ala Gln Lys Val Met Lys Val Thr Lys
1 5 10 15

Ala Ala Gly Met Asn Ile Ile Ser Asn Cys Glu Glu Val Ala Gly Gln
20 25 30

Thr Val Phe His Thr His Val His Leu Val Pro Arg Tyr Ser Ala Asp
35 40 45

Asp Asp Leu Lys Ile Asp Phe Ile Ala His Glu Thr Asp Phe Asp
50 55 60 

(2) INFORMATION FOR SEQ ID NO: 214: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 67 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:214:

Met Ser Asp Cys Ile Phe Cys Lys Ile Ile Ala Gly Glu Ile Pro Ala
1 5 10 15

Ser Lys Val Tyr Glu Asp Glu Gln Val Leu Ala Phe Leu Asp Ile Ser
20 25 30

Gln Val Thr Leu Gly His Thr Leu Val Val Pro Lys Glu His Tyr Arg
35 40 45

Asn Leu Leu Glu Met Asp Ala Thr Ser Ala Thr Asn Ser Leu Pro Lys
50 55 60

Tyr Gln Lys
65 

(2) INFORMATION FOR SEQ ID NO: 215: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 212 amino acids 






(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:215: 

Ile Gln Ala Val Arg Asp Val Ser Phe Glu Val Asn Glu Gly Glu Val
1 5 10 15

Val Ser Leu Ile Gly Ala Asn Gly Ala Gly Lys Thr Thr Ile Leu Arg
20 25 30

Thr Leu Ser Gly Leu Val Arg Pro Ser Ser Gly Lys Ile Glu Phe Leu
35 40 45

Gly Gln Glu Ile Gln Lys Met Pro Ala Gln Lys Ile Val Ala Gly Gly
50 55 60

Leu Ser Gln Val Pro Glu Gly Arg His Val Phe Pro Gly Leu Thr Val
65 70 75 80

Met Glu Asn Leu Glu Met Gly Ala Phe Leu Lys Lys Asn Arg Glu Glu
85 90 95

Asn Gln Ala Asn Leu Lys Lys Val Phe Ser Arg Phe Pro Arg Leu Glu
100 105 110

Glu Arg Lys Asn Gln Asp Ala Ala Thr Leu Ser Gly Gly Glu Gln Gln
115 120 125

Met Leu Ala Met Gly Arg Ala Leu Met Ser Thr Pro Lys Leu Leu Leu
130 135 140

Leu Asp Glu Pro Ser Met Gly Leu Ala Pro Ile Phe Ile Gln Glu Ile
145 150 155 160

Phe Asp Ile Ile Gln Asp Ile Gln Lys Gln Gly Thr Thr Val Leu Leu
165 170 175

Ile Glu Gln Asn Ala Asn Lys Ala Leu Ala Ile Ser Asp Arg Gly Tyr
180 185 190

Val Leu Glu Gln Gly Asn Arg Leu Ser Gly Thr Gly Lys Asp Ser Leu
195 200 205

Ile Arg Gly Val
210 

(2) INFORMATION FOR SEQ ID NO:216: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:216:

Leu Leu Ser Leu Ile Asp Ile Leu Val Asp Gly Arg Tyr Asp Arg Thr
1 5 10 15

Lys Arg Asn Leu Met Leu Gln Phe Arg Gly Ser Ser Asn Gln Arg Ile
20 25 30

Ile Asp Ser Arg Gly Ser Pro Gly Thr Glu Leu
35 40 

(2) INFORMATION FOR SEQ ID NO: 217: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 130 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217: 

Met Asn Asn Pro Lys Pro Gln Glu Trp Lys Ser Glu Glu Leu Ser Gln
1 5 10 15

Gly Arg Ile Ile Asp Tyr Lys Ala Phe Asn Phe Val Asp Gly Glu Gly
20 25 30

Val Arg Asn Ser Leu Tyr Val Ser Gly Cys Met Phe His Cys Glu Gly
35 40 45

Cys Tyr Asn Val Ala Thr Trp Ser Phe Asn Ala Gly Ile Pro Tyr Thr
50 55 60

Ala Glu Leu Glu Glu Gln Ile Met Ala Asp Leu Ala Gln Pro Tyr Val
65 70 75 80

Gln Gly Leu Thr Leu Leu Gly Gly Glu Pro Phe Leu Asn Thr Gly Ile
85 90 95

Leu Leu Pro Leu Val Lys Arg Ile Arg Lys Glu Leu Pro Asp Lys Asp
100 105 110

Ile Trp Ser Trp Thr Gly Tyr Thr Trp Glu Glu Met Ile Pro Gly Asn
115 120 125 

(2) INFORMATION FOR SEQ ID NO:218:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 126 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:218:

Met Val Asn His Phe Arg Ile Asp Arg Val Gly Met Glu Ile Lys Arg
1 5 10 15

Glu Val Asn Glu Ile Leu Gln Lys Lys Val Arg Asp Pro Arg Val Gln
20 25 30

Gly Val Thr Ile Thr Asp Val Gln Met Leu Gly Asp Leu Ser Val Ala
35 40 45

Lys Val Tyr Tyr Thr Ile Leu Ser Asn Leu Ala Ser Asp Asn Gln Lys
50 55 60

Ala Gln Ile Gly Leu Glu Lys Ala Thr Gly Thr Ile Lys Arg Glu Leu
65 70 75 80

Gly Arg Asn Leu Lys Leu Tyr Xaa Ile Pro Asp Leu Thr Phe Val Lys
85 90 95

Asp Glu Ser Ile Glu Xaa Gly Thr Lys Ile Asp Glu Met Leu Arg Asn
100 105 110

Leu Asp Lys Thr Lys Glu Glu Gly Val Ala Pro Leu Phe Trp 
115 120 125 

(2) INFORMATION FOR SEQ ID NO:219:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 267 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:219:

Phe His His Val Thr Val Leu Leu His Glu Thr Ile Asp Met Leu Asp
1 5 10 15

Val Lys Pro Glu Gly Ile Tyr Val Asp Ala Thr Leu Gly Gly Ala Gly
20 25 30

His Ser Glu Tyr Leu Leu Ser Lys Leu Ser Glu Lys Gly His Leu Tyr
35 40 45

Ala Phe Asp Gln Asp Gln Asn Ala Ile Asp Asn Ala Gln Lys Arg Leu
50 55 60

Ala Pro Tyr Ile Glu Lys Gly Met Val Thr Phe Ile Lys Asp Asn Phe
65 70 75 80

Arg His Leu Gln Ala Arg Leu Arg Glu Ala Gly Val Gln Glu Ile Asp
85 90 95

Gly Ile Cys Tyr Asp Leu Gly Val Ser
100 105

Glu Arg Gly Phe Ser Tyr Lys Lys Asp Ala Pro Leu Asp Met Arg Met
115 120 125

Asn Gln Asp Ala Ser Leu Thr Ala Tyr Glu Val Val Asn His Tyr Asp
130 135 140

Tyr His Asp Leu Val Arg Ile Phe Phe Lys Tyr Gly Glu Asp Lys Phe
145 150 155 160

Ser Lys Gln Ile Ala Arg Lys Ile Glu Gln Ala Arg Glu Val Lys Pro
165 170 175

Ile Glu Thr Thr Thr Glu Leu Ala Glu Ile Ile Lys Leu Val Lys Pro
180 185 190

Ala Lys Glu Leu Lys Lys Lys Gly His Pro Ala Lys Gln Ile Phe Gln
195 200 205

Ala Ile Arg Ile Glu Val Asn Asp Glu Leu Gly Ala Ala Asp Glu Ser
210 215 220

Ile Gln Gln Ala Met Asp Met Leu Ala Leu Asp Gly Arg Ile Ser Val
225 230 235 240

Ile Thr Phe His Ser Leu Glu Asp Arg Leu Thr Lys Gln Leu Phe Lys
245 250 255 

Xaa Ala Ser Thr Val Glu Val Pro Lys Gly Leu 
260 265 

(2) INFORMATION FOR SEQ ID NO: 220: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 165 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:220:

Leu Met His Val Thr Val Gly Glu Leu Ile Gly Asn Phe Ile Leu Ile
1 5 10 15

Thr Gly Ser Phe Ile Leu Leu Leu Val Leu Ile Lys Lys Phe Ala Trp
20 25 30

Ser Asn Ile Thr Gly Ile Phe Glu Glu Arg Ala Glu Lys Ile Ala Ser
35 40 45

Asp Ile Asp Arg Ala Glu Glu Ala Arg Gln Lys Ala Glu Val Leu Ala
50 55 60

Gln Lys Arg Glu Asp Glu Leu Ala Gly Ser Arg Lys Glu Ala Lys Thr
65 70 75 80

Ile Ile Glu Asn Ala Lys Glu Thr Ala Glu Gln Ser Lys Ala Asn Ile
85 90 95

Leu Ala Asp Ala Lys Leu Glu Ala Gly His Leu Lys Glu Lys Ala Asn
100 105 110

Gln Glu Ile Ala Gln Asn Lys Val Glu Ala Leu Gln Ser Val Lys Gly
115 120 125

Glu Val Ala Asp Leu Thr Ile Ser Leu Ala Gly Lys Ile Ile Ser Gln
130 135 140

Asn Leu Asp Ser His Ala His Lys Ala Leu Ile Asp Gln Tyr Ile Asp
145 150 155 160

Gln Leu Gly Glu Ala
165 

(2) INFORMATION FOR SEQ ID NO: 221: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 629 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:221: 

Met Gln Arg Leu Val Ser Leu Leu Ile Trp Ser Leu Leu Glu Thr Ser
1 5 10 15

Ile Leu Ser Ile His Gly Leu Gly Pro Leu Thr Lys Arg Phe Gly Val

Ala Leu Glu His His His Met Ala Asn Tyr Asp Ala Glu Ala Thr Gly
35 40 45

Arg Leu Leu Phe Ile Phe Ile Lys Glu Val Ala Glu Lys His Gly Val
50 55 60

Thr Asp Leu Ala Arg Leu Asn Ile Asp Leu Ile Ser Pro Asp Ser Tyr
65 70 75 80

Lys Lys Ala Arg Ile Lys His Ala Thr Ile Tyr Val Lys Asn Gln Val



Gly Leu Lys Asn Ile Phe Lys Leu Val Ser Leu Ser Asn Thr Lys Tyr
100 105 110

Phe Glu Gly Val Ser Arg Ile Pro Arg Thr Val Leu Asp Ala His Arg
115 120 125

Glu Gly Leu Ile Leu Gly Ser Ala Cys Ser Glu Gly Glu Val Phe Asp
130 135 140

Val Val Val Ser Gln Gly Val Asp Ala Ala Val Glu Val Ala Lys Tyr
145 150 155 160

Tyr Asp Phe Ile Glu Val Met Pro Pro Ala Ile Tyr Ala Pro Leu Ile
165 170 175

Ala Lys Glu Gln Val Lys Asp Met Glu Glu Leu Gln Thr Ile Ile Lys
180 185 190

Ser Leu Ile Glu Val Gly Asp Arg Leu Gly Lys Pro Val Leu Ala Thr
195 200 205

Gly Asn Val His Tyr Ile Glu Pro Glu Glu Glu Ile Tyr Arg Glu Ile
210 215 220

Ile Val Arg Ser Leu Gly Gln Gly Ala Met Ile Asn Arg Thr Ile Gly
225 230 235 240

His Gly Glu His Ala Gln Pro Ala Pro Leu Pro Lys Ala His Phe Arg
245 250 255

Thr Thr Asn Glu Met Leu Asp Glu Phe Ala Phe Leu Gly Glu Glu Leu
260 265 270

Ala Arg Lys Leu Val Ile Glu Asn Thr Asn Ala Leu Ala Glu Ile Phe
275 280 285

Glu Pro Val Glu Val Val Lys Gly Asp Leu Tyr Thr Pro Phe Ile Asp
290 295 300

Lys Ala Glu Glu Thr Val Ala Glu Leu Thr Tyr Lys Lys Ala Phe Glu
305 310 315 320

Ile Tyr Gly Asn Pro Leu Pro Asp Ile Val Asp Leu Arg Ile Glu Lys
325 330 335

Glu Leu Thr Ser Ile Leu Gly Asn Gly Phe Ala Val Ile Tyr Leu Ala
340 345 350 

Ser Gln Met Leu Val Gln Arg Ser Asn Glu Arg Gly Tyr Leu Val Gly
355 360 365

Ser Arg Gly Ser Val Gly Ser Ser Phe Val Ala Thr Met Ile Gly Ile
370 375 380

Thr Glu Val Asn Pro Leu Ser Pro His Tyr Val Cys Gly Gln Cys Gln
385 390 395 400

Tyr Ser Glu Phe Ile Thr Asp Gly Ser Tyr Gly Ser Gly Phe Asp Met
405 410 415

Pro His Lys Asp Cys Pro Asn Cys Gly His Lys Leu Ser Lys Asn Gly
420 425 430

Gln Asp Ile Pro Phe Glu Thr Phe Leu Gly Phe Asp Gly Asp Lys Val
435 440 445

Pro Asp Ile Asp Leu Asn Phe Ser Gly Glu Asp Gln Pro Ser Ala His
450 455 460

Leu Asp Val Arg Asp Ile Phe Gly Glu Glu Tyr Ala Phe Arg Ala Gly
465 470 475 480

Thr Val Gly Thr Val Ala Ala Lys Thr Ala Tyr Gly Phe Val Lys Gly
485 490 495

Tyr Glu Arg Asp Tyr Gly Lys Phe Tyr Arg Asp Ala Glu Val Glu Arg
500 505 510

Leu Ala Gln Gly Ala Ala Gly Val Lys Arg Thr Thr Gly Gln His Pro
515 520 525

Gly Gly Ile Val Val Ile Pro Asn Tyr Met Asp Val Tyr Asp Phe Thr
530 535 540

Pro Val Gln Tyr Pro Ala Asp Asp Val Thr Ala Glu Trp Gln Thr Thr
545 550 555 560

His Phe Asn Phe His Asp Ile Asp Glu Asn Val Leu Lys Leu Asp Val
565 570 575

Leu Gly His Asp Asp Pro Thr Met Ile Arg Lys Leu Gln Asp Leu Ser
580 585 590

Gly Ile Asp Pro Asn Lys Ile Pro Met Asp Asp Glu Gly Val Met Ala
595 600 605

Leu Phe Ser Gly Thr Asp Val Leu Gly Val Thr Pro Glu Gln Ile Gly
610 615 620 



(2) INFORMATION FOR SEQ ID NO:222: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 693 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222: 

Met Ala Arg Glu Phe Ser Leu Glu Lys Thr Arg Asn Ile Gly Ile Met
1 5 10 15

Ala His Val Asp Ala Gly Lys Thr Thr Thr Thr Glu Arg Ile Leu Tyr
20 25 30

Tyr Thr Gly Lys Ile His Lys Ile Gly Glu Thr His Glu Gly Ala Ser
35 40 45

Gln Met Asp Trp Met Glu Gln Glu Gln Glu Arg Gly Ile Thr Ile Thr
50 55 60

Ser Ala Ala Thr Thr Ala Gln Trp Asn Asn His Arg Val Asn Ile Ile
65 70 75 80

Asp Thr Pro Gly His Val Asp Phe Thr Ile Glu Val Gln Arg Ser Leu
85 90 95

Arg Val Leu Asp Gly Ala Val Thr Val Leu Asp Ser Gln Ser Gly Val
100 105 110

Glu Pro Gln Thr Glu Thr Val Trp Arg Gln Ala Thr Glu Tyr Gly Val
115 120 125

Pro Arg Ile Val Phe Ala Asn Lys Met Asp Lys Ile Gly Ala Asp Phe
130 135 140

Leu Tyr Ser Val Ser Thr Leu His Asp Arg Leu Gln Ala Asn Ala His
145 150 155 160

Pro Ile Gln Leu Pro Ile Gly Ser Glu Asp Asp Phe Arg Gly Ile Ile
165 170 175

Asp Leu Ile Lys Met Lys Ala Glu Ile Tyr Thr Asn Asp Leu Gly Thr
180 185 190

Asp Ile Leu Glu Glu Asp Ile Pro Ala Glu Tyr Leu Asp Gln Ala Gln
195 200 205

Glu Tyr Arg Glu Lys Leu Ile Glu Ala Val Ala Glu Thr Asp Glu Glu
210 215 220

Leu Met Met Lys Tyr Leu Glu Gly Glu Glu Ile Thr Asn Glu Glu Leu
225 230 235 240

Lys Ala Gly Ile Arg Lys Ala Thr Ile Asn Val Glu Phe Phe Pro Val
245 250 255 
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Leu Cys Gly Ser 
260 



Ala Val He Asp 
275 



Ala Phe Lys . 
Tyr Leu Pro 



•> Asp Thr Asp 
295 



Val Gly Arg Leu ' 



Ala Glu 
Ala Phe 
Arg Val 



Gly Val 
Leu Asp 
Glu He 



Gin Leu Met Leu 

270 



Asp 
Lys 



Lys He Net Thr Asp Pro Phe 
315 320 

Ser 



Gly Ser Tyr Val 
340 



Leu Asn Thr 



l Met His Ala 



> He Ala Ala 
375 



Gly Asp Ser Leu 

385 



Asn Val Pro Glu Pro Val He 
4 05 



Ala Asp Gin Asp 
420 



Lys Met Gly 
Arg Val Glu 



: Gly Glu Leu 
455 



Tyr Arg Glu Thr 



Ala Val 
Lys Ala 
Gin Leu 



He Ala 
425 



His Leu 
Ala Asn 
Ser Thr 



Tyr Ser 
330 

Gly Lys 

Arg Gin 

Gly Leu 

Lys He 
395 

Met Val 
410 

Leu Gin 
Val Glu 
Asp Val 
Val Gly 



Gly Val Leu Gin 
335 



Arg Glu Arg He 
350 



Glu lie Asp Thr 
365 



He Leu Glu Ser 



Glu Pro Lys Ser 
415 



Lys Leu Ala Gil 
430 



Gly 
Val 



Lys 
Glu 



475 



Ala Pro Gin Val Ser 



Arg Gin Ser Gly 
500 



Thr Pro Asn Gil 
515 



Gly Lys Gly 
Glu Gly Lys 



Gin Phe 
505 



Pro Arg Glu 
535 



Val Lys Ala Lys 
Glu Thr Ala Phe 



Phe He 
Val Leu 
Gly Ser 
Ala Ser 



Gin Ala 
490 

Gly Asp 
Glu Phe 
Pro Ala 



Ala Gly 
555 



Arg Gly Phe Phe 
495 



Val Trp He Glu 
510 



Tyr Pro Met Val 



i Asp Val Asp Ser . 

575 



Leu Ser Leu Lys Glu Ala Ala 

Lys Ser Ala Gln Pro Ala Ile Leu Glu Pro Met Met Leu Val Thr Ile
595 600 605

Thr Val Pro Glu Glu Asn Leu Gly Asp Val Met Gly His Val Thr Ala
610 615 620

Arg Arg Gly Arg Val Asp Gly Met Glu Ala His Gly Asn Ser Gln Ile
625 630 635 640

Val Arg Ala Tyr Val Pro Leu Ala Glu Met Phe Gly Tyr Ala Thr Val
645 650 655

Leu Arg Ser Ala Ser Gln Gly Arg Gly Thr Phe Met Met Val Phe Asp
660 665 670

His Tyr Glu Asp Val Pro Lys Ser Val Gln Glu Glu Ile Ile Lys Lys
675 680 685 



(2) INFORMATION FOR SEQ ID NO:223: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 274 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223: 

Ala Tyr Lys Gly His Gln Glu Tyr Val Leu Pro Gln Ala Ala Arg Lys
1 5 10 15

Ile Tyr Ala Tyr Arg Arg Tyr Asp Leu Asn Glu Ser Pro Lys Thr Ala
20 25 30

Leu Asp Leu Ile Ile Pro Asp Leu Phe Leu His Ile Leu Asn Pro Ala
35 40 45

Glu Arg Glu Arg Lys Leu Lys Arg Glu Gly Val Glu Glu Leu Tyr Leu
50 55 60

Leu Asp Phe Ser Ser Gln Phe Ala Ser Leu Thr Ala Gln Glu Phe Phe
65 70 75 80

Ala Thr Tyr Ile Lys Ala Met Asn Ala Lys Ile Ile Val Ala Gly Phe
85 90 95

Asp Tyr Thr Phe Gly Ser Asp Lys Lys Thr Ala Glu Asp Leu Lys Asp
100 105 110

Tyr Phe Asp Gly Glu Val Ile Ile Val Pro Pro Val Glu Asp Glu Lys
115 120 125

Gly Lys Ile Ser Ser Thr Arg Ile Arg Gln Ala Ile Leu Asp Gly Asn
130 135 140

Val Lys Glu Ala Gly Lys Leu Leu Gly Ala Pro Leu Pro Ser Arg Gly
145 150 155 160

Met Val Val His Gly Asn Ala Arg Gly Arg Thr Ile Gly Tyr Pro Thr
165 170 175

Ala Asn Leu Val Leu Leu Asp Arg Thr Tyr Met Pro Ala Asp Gly Val
180 185 190

Tyr Val Val Asp Val Glu Ile Gln Arg Gln Lys Tyr Arg Ala Met Ala
195 200 205

Ser Val Gly Lys Asn Val Thr Phe Asp Gly Glu Glu Ala Arg Phe Glu
210 215 220

Val Asn Ile Phe Asp Phe Asn Gln Asp Ile Tyr Gly Glu Thr Val Met
225 230 235 240

Val Tyr Trp Leu Asp Arg Ile Arg Asp Met Thr Lys Phe Asp Ser Val
245 250 255

Asp Gln Leu Val Asp Gln Leu Lys Ala Asp Glu Glu Val Thr Arg Asn



(2) INFORMATION FOR SEQ ID NO:224: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 124 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224: 

Leu Arg Lys Glu Pro Ser Met Ala Lys Gly Glu Gly Lys Val Val Ala 
1 5 10 15 

Gln Asn Lys Lys Ala Arg His Asp Tyr Thr Ile Val Asp Thr Leu Glu
20 25 30

Ala Gly Met Val Leu Thr Gly Thr Glu Ile Lys Ser Val Arg Ala Ala
35 40 45

Arg Ile Asn Leu Lys Asp Gly Phe Ala Gln Val Lys Asn Gly Glu Val

Trp Leu Ser Asn Val His Ile Ala Pro Tyr Glu Glu Gly Asn Ile Trp
65 70 75 80

Asn Gln Glu Pro Glu Arg Arg Arg Lys Leu Leu Leu His Lys Lys Gln
85 90 95

Ile Gln Lys Leu Glu Gln Glu Thr Lys Gly Thr Gly Met Thr Leu Val
100 105 110

Pro Leu Lys Val Tyr Met Ala Thr Leu Ser Phe Phe 
115 120 

(2) INFORMATION FOR SEQ ID NO: 225: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 441 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:225: 

Ile Val Lys Glu Glu Lys Gly Leu Lys Glu Lys Gln Phe Trp Asn Arg
1 5 10 15

Ile Leu Glu Phe Ala Gln Glu Arg Leu Thr Arg Ser Met Tyr Asp Phe
20 25 30

Tyr Ala Ile Gln Ala Glu Leu Ile Lys Val Glu Glu Asn Val Ala Thr
35 40 45

Ile Phe Leu Pro Arg Ser Glu Met Glu Met Val Trp Glu Lys Gln Leu
50 55 60

Lys Asp Ile Ile Val Val Ala Gly Phe Glu Ile Tyr Asp Ala Glu Ile
65 70 75 80

Thr Pro His Tyr Ile Phe Thr Lys Pro Gln Asp Thr Thr Ser Ser Gln
85 90 95

Val Glu Glu Ala Thr Asn Leu Thr Leu Tyr Asp Tyr Ser Pro Lys Leu
100 105 110

Val Ser Ile Pro Tyr Ser Asp Thr Gly Leu Lys Glu Lys Tyr Thr Phe
115 120 125

Asp Asn Phe Ile Gln Gly Asp Gly Asn Val Trp Ala Val Ser Ala Ala
130 135 140

Leu Ala Val Ser Glu Asp Leu Ala Leu Thr Tyr Asn Pro Leu Phe Ile
145 150 155 160

Tyr Gly Gly Pro Gly Leu Gly Lys Thr His Leu Leu Asn Ala Ile Gly
165 170 175

Asn Glu Ile Leu Lys Asn Ile Pro Asn Ala Arg Val Lys Tyr Ile Pro
180 185 190

Ala Glu Ser Phe Ile Asn Asp Phe Leu Asp His Leu Arg Leu Gly Glu
195 200 205

Met Glu Lys Phe Lys Lys Thr Tyr Arg Ser Leu Asp Leu Leu Leu Ile
210 215 220

Asp Asp Ile Gln Ser Leu Ser Gly Lys Lys Val Ala Thr Gln Glu Glu
225 230 235 240

Phe Phe Asn Thr Phe Asn Ala Leu His Asp Lys Gln Lys Gln Ile Val
245 250 255

Leu Thr Ser Asp Arg Ser Pro Lys His Leu Glu Gly Leu Glu Glu Arg
260 265 270

Leu Val Thr Arg Phe Ser Trp Gly Leu Thr Gln Thr Ile Thr Pro Pro
275 280 285

Asp Phe Glu Thr Arg Ile Ala Ile Leu Gln Ser Lys Thr Glu His Leu
290 295 300

Gly Tyr Asn Phe Gln Ser Asp Thr Leu Glu Tyr Leu Ala Gly Gln Phe
305 310 315 320

Asp Ser Asn Val Arg Asp Leu Glu Gly Ala Ile Asn Asp Ile Thr Leu
325 330 335

Ile Ala Arg Val Lys Lys Ile Lys Asp Ile Thr Ile Asp Ile Ala Ala
340 345 350

Glu Ala Ile Arg Ala Arg Lys Gln Asp Val Ser Gln Met Leu Val Ile
355 360 365

Pro Ile Asp Lys Ile Gln Thr Glu Val Gly Asn Phe Tyr Gly Val Ser
370 375 380

Ile Lys Glu Met Lys Gly Ser Arg Arg Leu Gln Asn Ile Val Leu Ala
385 390 395 400

Arg Gln Val Ala Met Tyr Leu Ser Arg Glu Leu Thr Asp Asn Ser Leu
405 410 415

Pro Lys Ile Gly Lys Glu Leu Gly Glu Lys Ser Tyr His Ser His Ser
420 425 430

Cys Pro Cys Gln Asn Lys Ile Leu Asn
435 440 

(2) INFORMATION FOR SEQ ID NO:226: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 201 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:226: 

Glu Leu Val Ser Thr Met Tyr Phe Arg Phe Asp Tyr Tyr Ser Gln Asn
1 5 10 15

Leu Gly Glu Ile Phe Ala Ile Gly Met Val Val Gly His Leu Arg Trp
20 25 30

Leu Ile Thr Gly Ala Leu Val Leu Tyr Ile Phe Ala Asp Arg Lys Leu
35 40 45

Ile Asn Thr Trp Asp Phe Leu Asp Ile Ala Ala Pro Ser Val Met Ile
50 55 60

Ala Gln Ser Leu Gly Arg Trp Gly Asn Phe Phe Asn Gln Glu Ala Tyr
65 70 75 80

Gly Ala Thr Val Asp Asn Leu Asp Tyr Leu Pro Gly Phe Ile Arg Asp
85 90 95

Gln Met Tyr Ile Glu Gly Ser Tyr Arg Gln Pro Thr Phe Leu Tyr Glu
100 105 110

Ser Leu Trp Asn Leu Leu Gly Phe Ala Leu Ile Leu Ile Phe Arg Arg
115 120 125

Lys Trp Lys Ser Leu Arg Arg Gly His Ile Thr Ala Phe Tyr Leu Ile
130 135 140

Trp Tyr Gly Phe Gly Arg Met Val Ile Glu Gly Met Arg Thr Asp Ser
145 150 155 160

Leu Met Phe Phe Gly Leu Arg Val Ser Gln Trp Leu Ser Val Val Leu
165 170 175

Ile Gly Leu Gly Ile Met Ile Val Ile Tyr Gln Asn Arg Lys Lys Ala
180 185 190

Pro Tyr Tyr Ile Thr Glu Glu Glu Asn
195 200 

(2) INFORMATION FOR SEQ ID NO:227:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 491 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:227: 

Leu Glu Asp Phe Pro Leu Ser Val Thr Asn Pro Tyr Gly Arg Thr Lys 
1 5 10 15

Leu Met Leu Glu Glu Ile Leu Thr Asp Ile Tyr Lys Ala Asp Ser Glu
20 25 30

Trp Asn Val Val Leu Leu Arg Tyr Phe Asn Pro Ile Gly Val His Glu
35 40 45

Ser Gly Asp Leu Gly Glu Asn Pro Asn Gly Ile Pro Asn Asn Leu Leu
50 55 60

Pro Tyr Val Thr Gln Val Ala Val Gly Lys Leu Glu Gln Val Gln Val
65 70 75 80

Phe Gly Asp Asp Tyr Asp Thr Glu Asp Gly Thr Gly Val Arg Asp Tyr
85 90 95

Ile His Val Val Asp Leu Ala Lys Gly His Val Ala Ala Leu Lys Lys
100 105 110

Ile Gln Lys Gly Ser Gly Leu Asn Val Tyr Asn Leu Gly Thr Gly Lys
115 120 125

Gly Tyr Ser Val Leu Glu Ile Ile Gln Asn Met Glu Lys Ala Val Gly
130 135 140

Cys Pro Ile Pro Tyr Arg Ile Val Glu Arg Arg Pro Gly Asp Ile Ala
145 150 155 160

Ala Cys Tyr Ser Asp Pro Ala Lys Ala Lys Ala Glu Leu Gly Trp Glu
165 170 175

Ala Glu Leu Asp Ile Thr Gln Met Cys Glu Gly His Gly Val Gly Arg
180 185 190

Ala Ser Ile Gln Met Asp Leu Lys Thr Lys Met Met Ile Ser Ile Ile
195 200 205

Val Pro Cys Leu Asn Glu Glu Glu Val Leu Pro Leu Phe Tyr Gln Ala
210 215 220

Leu Glu Ala Leu Leu Pro Asp Leu Glu Thr Glu Ile Glu Tyr Val Phe
225 230 235 240

Val Asp Asp Gly Ser Ser Asp Gly Thr Leu Glu Leu Leu Lys Ala Tyr
245 250 255

Arg Glu Gln Asn Pro Ala Val His Tyr Ile Ser Phe Ser Arg Asn Phe
260 265 270

Gly Lys Glu Ala Ala Leu Tyr Ala Gly Leu Gln Tyr Ala Thr Gly Asp
275 280 285

Leu Val Val Val Met Asp Ala Asp Leu Gln Asp Pro Pro Ser Met Leu
290 295 300

Phe Glu Met Lys Asn Val Leu Asp Lys Asn Val Asp Leu Asp Cys Val
305 310 315 320

Gly Thr Arg Arg Thr Ser Arg Glu Gly Glu Pro Phe Phe Arg Ser Phe
325 330 335

Cys Ala Val Leu Phe Tyr Arg Leu Met Gln Lys Ile Ser Pro Val Ala
340 345 350

Leu Pro Ser Gly Val Arg Asp Phe Arg Met Met Arg Arg Ser Val Val
355 360 365

Asp Ala Ile Leu Ser Leu Thr Glu Ser Asn Arg Phe Ser Lys Gly Leu
370 375 380

Phe Ala Trp Val Gly Phe Lys Thr His Tyr Leu Asp Tyr Pro Asn Val
385 390 395 400

Glu Arg Gln Ala Gly Lys Thr Ser Trp Ser Phe Arg Gln Leu Phe Phe
405 410 415

Tyr Ser Ile Glu Gly Ile Val Asn Phe Ser Asp Phe Pro Leu Thr Ile
420 425 430

Ala Phe Val Ala Gly Leu Leu Ser Cys Phe Leu Ser Leu Leu Met Thr
435 440 445

Phe Phe Val Val Val Arg Thr Leu Ile Leu Gly Asn Pro Thr Ser Gly
450 455 460

Trp Thr Ser Leu Met Ala Val Ile Leu Tyr Leu Gly Gly Ile Gln Leu
465 470 475 480

Leu Thr Ile Gly Ile Leu Gly Lys Tyr Asn Gln
485 490 

(2) INFORMATION FOR SEQ ID NO:228:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 277 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:228: 

Val Ile Ile Ile Asp Asp Asn Tyr Ser Asn Val Asn Leu Arg Asn Lys
1 5 10 15

Ile Ile His Gln Phe Gly Tyr Thr Asn His Arg Ile Lys Leu Ile Leu
20 25 30

Ser Asn Glu Asp Leu Gly Ala Thr Asn Ala Arg Asn Ile Gly Ile Lys
35 40 45

Asn Ser Arg Gly Lys Tyr Ile Ser Phe Leu Asp Asp Asp Asp Glu Tyr
50 55 60

Met Pro Asp Arg Ile Leu Lys Leu Met Ala Cys Phe Lys Lys Ser Arg
65 70 75 80

Met Lys Asn Leu Ala Leu Val Tyr Ser Tyr Gly Ile Ile Ile Tyr Pro
85 90 95

Asn Gly Thr Arg Glu Glu Glu Lys Thr Asp Phe Val Gly Asn Pro Leu 
100 105 110 

Phe Val Gln Met Val His Asn Ile Ala Gly Thr Ser Phe Trp Leu Cys
115 120 125

Lys Lys Glu Val Leu Glu Leu Ile Asn Gly Phe Glu Lys Ile Asp Ser
130 135 140

His Gln Asp Gly Val Val Leu Leu Lys Leu Leu Ala Gln Gly Tyr Gln
145 150 155 160

Ile Asp Ile Val Arg Glu Phe Leu Val Asn Tyr Tyr Ala His Ser Lys
165 170 175

Glu Asn Gly Ile Thr Gly Val Thr Gln Lys Thr Ile Asn Ala Asp Glu
180 185 190

Glu Tyr Tyr Asn Tyr Cys Arg Lys Tyr Phe Asn Leu Leu Ser Phe Asn
195 200 205

Glu Arg Ile Leu Val Thr Lys Lys Tyr Tyr Ser Leu Asn Ile Lys Arg
210 215 220

Leu Leu Leu Ile Gly Asp Lys Cys Lys Ala Leu Lys Val Ile Lys Lys
225 230 235 240

Ala Arg Glu Glu Lys Ile Phe Asn Glu Phe Leu Phe Leu Lys Tyr Met
245 250 255

Leu Leu Tyr Arg Ser Phe Phe Tyr Cys Ile Tyr Asp Asn Tyr Val Gln
260 265 270 

Leu Lys Phe Arg Lys 
275 



CLAIMS 

1. An isolated nucleic acid compound comprising a 
sequence identical to or substantially identical to a 
sequence selected from the group consisting of SEQ ID NO:1
through SEQ ID NO:86.

2. An isolated nucleic acid compound comprising a 
sequence identical to or substantially identical to a 
sequence selected from the group consisting of SEQ ID NO:87,
SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95, SEQ
ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID
NO:105, SEQ ID NO:107, SEQ ID NO:109, SEQ ID NO:111, SEQ ID
NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119, and SEQ
ID NO:121.

3. A substantially purified protein or fragment 
thereof from S. pneumoniae wherein said protein is selected 
from the group consisting of SEQ ID NO:88, SEQ ID NO:90, SEQ
ID NO:92, SEQ ID NO:94, SEQ ID NO:96, SEQ ID NO:98, SEQ ID
NO:100, SEQ ID NO:102, SEQ ID NO:104, SEQ ID NO:106, SEQ ID
NO:108, SEQ ID NO:110, SEQ ID NO:112, SEQ ID NO:114, SEQ ID
NO:116, SEQ ID NO:118, SEQ ID NO:120, SEQ ID NO:122, and SEQ
ID NO:123 through SEQ ID NO:228.

4. An isolated nucleic acid compound encoding any
one of the proteins or fragments thereof of Claim 3. 

5. A vector comprising any one of the nucleic acid 
compounds of claims 1, 2, or 4.

6. A recombinant host containing any one of the 
vectors of claim 5.

7. A substantially purified protein from
Streptococcus pneumoniae as in Claim 3 wherein said protein 
is an external target protein selected from Table 1. 

8. A substantially purified protein from 
Streptococcus pneumoniae as in Claim 3 wherein said protein 
is a hypothetical protein selected from Table 1. 

9. A substantially purified protein from 
Streptococcus pneumoniae as in Claim 3 wherein said protein 
is a cell wall synthetic protein selected from Table 1. 

10. A substantially purified protein from 
Streptococcus pneumoniae as in Claim 3 wherein said protein 
is a minimal gene set protein selected from Table 1. 

11. A DNA chip having arrayed thereon any at least 
15 base pair fragment of any one or more of the nucleic 
acids of claim 1. 

12. A DNA chip having arrayed thereon any at least 
15 base pair fragment of any one or more of the nucleic 
acids of claim 2. 

13. A method for evaluating gene expression in 
Streptococcus pneumoniae comprising the step of incubating a 
DNA chip of claim 11 or Claim 12 with cDNA prepared from 
Streptococcus pneumoniae under conditions suitable for 
hybridization of complementary nucleic acid sequences. 

14. A computer readable medium having recorded 
thereon any one or more of the nucleotide sequences of 
Claim 1 or Claim 2.

15. A method for identifying virulence genes in S.
pneumoniae, comprising the steps of: 

a) preparing a DNA chip as in claim 11,

b) preparing labeled cDNAs from

i) S. pneumoniae cells recovered from an in
vivo environment, and

ii) S. pneumoniae cells grown in vitro,

c) hybridizing individually the cDNAs of steps
(b)(i) and (b)(ii) to a chip of step (a); and

d) identifying a genomic DNA fragment or fragments
on said chip that hybridize to the cDNAs of (b)(i) but not
with the cDNAs of (b)(ii).

16. An antibody that selectively binds to a 
protein or peptide of Claim 3.

17. An antibody that selectively binds to an 
external target protein, or fragment thereof, identified in 
Table 1. 

18. A DNA chip of Claim 11 or Claim 12 further 
comprising a layer of S. pneumoniae cells wherein said layer
contacts with said nucleic acids.